ABSTRACT 

We use the statistics of regions above or below a temperature threshold (excur- 
sion sets) to study the cosmic microwave background (CMB) anisotropy in models 
with primordial non-Gaussianity of the local type. By computing the full-sky spatial 
distribution and clustering of pixels above/below threshold from a large set of simu- 
lated maps with different levels of non-Gaussianity, we find that a positive value of the 
dimensionless non-linearity parameter $f_{\rm NL}$ enhances the number density of the cold 
CMB excursion sets along with their clustering strength, and reduces that of the hot 
ones. We quantify the robustness of this effect, which may be important to discriminate 
between the simpler Gaussian hypothesis and non-Gaussian scenarios, arising either 
from non-standard inflation or alternative early-universe models. The clustering of 
hot and cold pixels exhibits distinct non-Gaussian signatures, particularly at angular 
scales of about 75 arcmin (i.e. around the Doppler peak), which increase linearly with 
$f_{\rm NL}$. Moreover, the clustering changes strongly as a function of the smoothing angle. 
We propose several statistical tests to maximize the detection of a local primordial 
non-Gaussian signal, and provide some theoretical insights within this framework, in- 
cluding an optimal selection of the threshold level. We also describe a procedure which 
aims at minimizing the cosmic variance effect, the main limit within this statistical 
framework. 
1 INTRODUCTION 

Since some level of non-Gaussianity is generically expected 
in all inflation models, due to interactions of the inflaton 
with gravity and/or from inflaton self-interactions, the search 
for deviations from the Gaussian paradigm has recently 
become a major effort - and a minor industry - in 
cosmology. Properties of the primordial perturbations are 
uniquely imprinted in the cosmic microwave background 
(CMB) anisotropy distribution; hence, its analysis is a powerful 
way of looking at the specifics of the inflationary models 
(or alternatives to inflation). At the present time, the main 
challenge is either to detect or to constrain mild or weak departures 
from primordial Gaussian initial conditions, as the 
level of non-Gaussianity predicted in the simplest single-field 
slow-roll inflation is slightly below the minimum value detectable 
by the Planck satellite, and not within reach of 
future galaxy surveys. This is essentially why primordial 
non-Gaussianity is regarded as one of the most promising 
probes of the inflationary universe (Komatsu et al. 2009b), 
and it has received a recent boost, both theoretically and observationally, 
mainly because of the Wilkinson Microwave 
Anisotropy Probe (WMAP) data, which seem to favor a 
slightly positive value of the dimensionless non-linearity parameter 
$f_{\rm NL}$ (Yadav & Wandelt 2008; Komatsu et al. 2009a, 
2010; Smith et al. 2009). 

From the theoretical side, much effort has been directed 
towards the development of competing scenarios for pertur- 
bation generation which go beyond the single-field slow-roll 
paradigm, for instance by the inclusion in the Lagrangian 
of non-trivial kinetic terms, the presence of more than one 
light field during inflation, the temporary violation of slow- 
roll, or a non-adiabatic initial vacuum state for the inflaton. 



2 Graziano Rossi et al. 



Examples are the curvaton model, the modulated reheat- 
ing, DBI or ghost inflation, or multi-field scenarios, some of 
which imply large departures from Gaussianity (see, for in- 
stance, among the plethora of papers on this subject, Linde 
& Mukhanov 1997; Lyth & Wands 2002; Acquaviva et al. 
2003; Lyth, Ungarelli & Wands 2003; Maldacena 2003; Alishahiha 
et al. 2004; Arkani-Hamed et al. 2004; Bartolo et 
al. 2004; Dvali, Gruzinov & Zaldarriaga 2004; Chen 2005; 
Seery & Lidsey 2005; Bartolo, Matarrese & Riotto 2006; 
Lyth & Riotto 2006; Sasaki et al. 2006; Creminelli et al. 
2007; Creminelli & Senatore 2007; Koyama et al. 2007; Buchbinder 
et al. 2008; Chen et al. 2008, 2009; Lehners & Steinhardt 
2008; Matarrese & Verde 2008; Sasaki 2008; Bartolo 
& Riotto 2009; Brandenberger 2009; Naruko & Sasaki 2009; 
Senatore, Tassev & Zaldarriaga 2009; Silvestri & Trodden 
2009; Bartolo, Matarrese & Riotto 2010). 

From the observational point of view, the main goal 
is to constrain the level of primordial non-Gaussianity di- 
rectly from a real data set, and this is usually achieved by 
constructing and applying a variety of non-Gaussian esti- 
mators such as the 3-point function (Hinshaw et al. 1994; 
Gangui et al. 1994), the genus statistics or the topological 
genus density (Coles 1988; Gott et al. 1990; Smoot et al. 
1994; Colley & Gott 2003; Park 2004; Gott et al. 2007), 
the other Minkowski functionals (Schmalzing & Gorski 1998; 
Winitzki & Kosowsky 1998; Banday, Zaroubi & Gorski 2000; 
Hikage et al. 2006, 2008b; Matsubara 2010), the bispectrum 
and trispectrum (Spergel et al. 2007; Komatsu et al. 
2009; Rudjord et al. 2009; Liguori et al. 2010), tensor modes 
(Coulson, Crittenden & Turok 1994), wavelets (Cabella et 
al. 2005; Curto et al. 2009; Vielva & Sanz 2009), pixel and 
peak statistics (Adler 1981; Bond & Efstathiou 1987; Coles 
& Barrow 1987; Kogut et al. 1995, 1996; Barreiro et al. 
1997, 1998; Heavens 1998; Heavens & Sheth 1999; Heavens 
& Gupta 2001; Hernandez-Monteagudo et al. 2004; Rossi 
et al. 2009; Hou et al. 2010), phase correlations, multifractals, 
and so forth (see also Komatsu, Spergel & Wandelt 
2005; Chen & Szapudi 2006; Munshi & Heavens 2010). In 
this process, many observational challenges and experimen- 
tal artifacts come into play; therefore, it is perhaps not sur- 
prising that controversial results and a long list of anomalies 
have been reported so far, ranging from a low value of the 
quadrupole to North-South or parity asymmetries, strange 
alignments in the data, and much more (see for example Chiang 
et al. 2003, 2007; Tegmark et al. 2003; de Oliveira-Costa 
et al. 2004; Eriksen et al. 2004, 2007; Schwarz et al. 2004; 
Cruz et al. 2005, 2006, 2007, 2008; Land & Magueijo 2005, 
2007; Naselsky et al. 2005; Copi et al. 2006, 2007; Vielva et 
al. 2007; Gurzadyan et al. 2008; Pietrobon et al. 2009; Rath 
et al. 2009; Kim & Naselsky 2010). 

Deviations from Gaussian initial conditions (if any) also 
carry important consequences on many aspects of the large- 
scale structure (LSS) of the Universe, and galaxy surveys can 
provide constraints on non-Gaussianity competitive with 
those from the CMB alone. There are in fact modifications 
in the statistics of voids (Kamionkowski, Verde & Jimenez 
2009), in the distribution of neutral hydrogen and in the intergalactic 
medium (Viel et al. 2009), in the high-mass tail 
of the halo distribution (Chiu et al. 1998; Matarrese, Verde 
& Jimenez 2000; Sefusatti & Komatsu 2007; LoVerde et al. 
2008), in the large-scale skewness of the galaxy distribution 
(Chodorowski & Bouchet 1996), in the number counts of 
clusters and of density peaks (Desjacques et al. 2009; Jeong 
& Komatsu 2009), in the measurement of the scale dependence 
of the bias of LSS tracers (Carbone et al. 2008; Dalal 
et al. 2008; Verde & Matarrese 2009; Desjacques & Seljak 
2010), in the reionization history (Crociani et al. 2009), in 
the galaxy power spectrum and bispectrum (Scoccimarro 
2000; Scoccimarro et al. 2004; Mangilli & Verde 2009), in the 
topology (Park et al. 1998, 2005; Gott et al. 2008; Hikage et 
al. 2008a), and in the abundance and clustering of galaxies 
and dark matter halos (Verde et al. 2001; Afshordi & Tolley 
2008; Grossi et al. 2008; LoVerde et al. 2008; Matarrese & 
Verde 2008; McDonald 2008; Slosar et al. 2008; Taruya et 
al. 2008; Pillepich et al. 2010). 

Despite all these remarkable theoretical and observational 
efforts, to date the experimental detection of a significant 
deviation from the Gaussian paradigm remains 
challenging and unconvincing. In this respect, we need 
to explore alternative statistics more sensitive to deviations 
from Gaussianity, and to search for unique features which 
may allow one to distinguish among the myriad of inflation 
models available in the literature. It is important to adopt 
different and complementary statistical approaches, and not 
just a single view, because non-Gaussianity can take innu- 
merable forms. In fact, while Gaussian random processes are 
theoretically desirable since they are the only ones for which 
the knowledge of all spectral parameters completely deter- 
mines all the statistical properties, as soon as we introduce 
departures from Gaussianity a more complicated scenario 
emerges, and there is no single statistic that describes fully 
and uniquely the non-Gaussian nature of a sample. In particular, 
moving away from standard estimators like the bispectrum, 
trispectrum, three- and four-point functions, skewness, 
etc, we are interested here in rare events, which can often 
maximize deviations from what is predicted by a Gaussian 
distribution. 

The main goal of the present work is to extend and ap- 
ply the statistics of the excursion sets, regions above or be- 
low a temperature threshold, to models with primordial non- 
Gaussianity. Specifically, we focus on the local parametrization 
of non-Gaussianity (Salopek & Bond 1990), by including 
quadratic corrections to the curvature perturbation. We 
simulate a large set of full-sky maps with different $f_{\rm NL}$ values, 
and compute the number density and the spatial clustering 
of the CMB excursion set regions. We also provide 
the theoretical formalism to interpret our results. The ex- 
cursion set statistics is fully characterized in the context of 
Gaussian random fields (Kaiser 1984; Bardeen et al. 1986), 
and it has been used in a variety of studies (see for example 
Jensen & Szalay 1986; Bond & Efstathiou 1987; Barreiro 
et al. 2001; Kashlinsky et al. 2001 and references therein). 
There are also some extensions to non-Gaussian conditions 
in the literature (e.g. Coles & Barrow 1987; Coles 1988; Barreiro 
et al. 1998). Our analysis differs from those of the 
previous authors primarily because we use a more realistic 
model for non-Gaussianity supported by $f_{\rm NL}$-type simulations, 
and because we also propose some new statistical 
tools, tests, and theoretical insights within this framework. 
In particular, while previous studies have always 
shown that the Gaussian correlation function of the excursion 
sets and peaks (a subset of the excursion sets) is easily 
distinguishable from a non-Gaussian one, even if the underlying 
bispectra are not statistically different (e.g. Kogut et 
al. 1995; Barreiro et al. 1998; Heavens & Gupta 2001), we 
suggest here that this may not be the case if the model of 
non-Gaussianity is of the local type, and the resolution adopted 
is not optimal. 

Our work is also motivated by another reason. In a pre- 
vious analysis (Rossi et al. 2009), we compared the pixel 
clustering statistics - properly extended to handle inho- 
mogeneous noise - against WMAP five-year data, and we 
detected deviations from the Gaussian theoretical expecta- 
tions. In particular, we found a remarkable difference in the 
clustering of hot and cold pixels at relatively small angular 
scales. A similar trend has also been reported in the litera- 
ture by Tojeiro et al. (2006), and by Hou, Banday & Gorski 
(2010), although at much larger scales. Whether or not this 
discrepancy may arise from primordial non-Gaussianity of 
the local type is another key question of this analysis. 

The layout of the paper is as follows. Section 2 contains 
the theoretical tools developed and used in this study. In 
Section 2.1 we briefly describe the local $f_{\rm NL}$ model. In Section 
2.2 we explain how the simulated non-Gaussian maps 
are constructed. In Section 2.3 we provide the basic formalism 
for the excursion sets statistics, in the context of 
$f_{\rm NL}$ scenarios. Expressions for the one- and two-dimensional 
probability distribution functions (PDFs) are given, under 
the assumption of weak non-Gaussianity; this is done via a 
perturbative approach by the multidimensional Edgeworth 
expansion around a Gaussian distribution function. Those 
PDFs are then used to characterize the number density and 
the clustering statistics above/below threshold as a function 
of $f_{\rm NL}$ (some details are provided in Appendix A). In 
Section 2.4 we relate the excursion sets formalism to other 
commonly used topological estimators. In Section 3, computations 
of the number density and the clustering statistics 
above/below threshold from non-Gaussian maps are presented 
and interpreted according to our theory predictions. 
Specifically, Section 3.1 shows the abundance of the excursion 
set regions in a variety of ways, while in Section 3.2 
we highlight some statistical tests developed using the number 
density. We also argue that there are optimal thresholds 
which can maximize the non-Gaussian contribution, as well 
as levels which do not allow one to distinguish a Gaussian signal 
from a non-Gaussian one. In Section 3.3 we present the 
clustering of hot and cold pixels for one of the optimal temperature 
thresholds as a function of the smoothing scale, 
and in Section 3.4 we propose a new statistical test derived 
from the clustering statistics. This procedure aims at minimizing 
the cosmic variance effect, and involves the computation 
of the power spectrum for any given CMB map. A 
final part (Section 4) summarizes our findings, and highlights 
ongoing and future work. We leave to Appendices B, 
C and D some technical details regarding experimental artifacts 
such as inhomogeneous noise, incomplete sky coverage, 
errorbar estimates and confusion effects caused by spurious 
non-Gaussianities; all these experimental complications will 
be examined in more detail in forthcoming publications. 



2 THEORETICAL FRAMEWORK 

In this paper we study the statistics of the excursion sets in 
CMB temperature maps, to examine its sensitivity to primordial 
non-Gaussianity. Even though the chosen statistics 
should be sensitive to a wide class of non-Gaussian fields, in 
the present work we consider the local $f_{\rm NL}$ model in detail. 

2.1 The local $f_{\rm NL}$ model 

Considerable interest has recently focused on local-type 
$f_{\rm NL}$, by which the non-Gaussianity of Bardeen's curvature 
perturbations is locally characterized in real space, up to 
second order, by: 

$$ \Phi(\mathbf{x}) = \phi(\mathbf{x}) + f_{\rm NL}\left[\phi^2(\mathbf{x}) - \langle\phi^2(\mathbf{x})\rangle\right] \tag{1} $$

and in Fourier space by 

$$ \Phi(\mathbf{k}) = \phi(\mathbf{k}) + f_{\rm NL}\int \frac{d^3k'}{(2\pi)^3}\,\phi(\mathbf{k}+\mathbf{k}')\,\phi^*(\mathbf{k}'), \tag{2} $$

where $\phi$ is a Gaussian field (Salopek & Bond 1990; Gangui 
et al. 1994; Verde et al. 2000; Komatsu & Spergel 2001). 
The local-type non-Gaussianity is sensitive to the bispectrum 
$B_\Phi(k_1,k_2,k_3)$ with squeezed configuration triangles 
(i.e. $k_1 \ll k_2 \simeq k_3$; Babich et al. 2004), defined as 

$$ \langle\Phi_{\mathbf{k}_1}\Phi_{\mathbf{k}_2}\Phi_{\mathbf{k}_3}\rangle = \delta_{\rm D}(\mathbf{k}_{123})\,B_\Phi(k_1,k_2,k_3) = \delta_{\rm D}(\mathbf{k}_{123})\,f_{\rm NL}\,F(k_1,k_2,k_3) \tag{3} $$

where $\delta_{\rm D}$ is the Dirac delta, $\mathbf{k}_{123} = \mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3$ and $f_{\rm NL}$ is a 
dimensionless parameter (or more generally a non-linearity 
function), while the function $F$ describes the dependence on 
the shape of triangular configurations defined by the three 
wave-numbers $k_1, k_2, k_3$. 
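
The real-space definition in equation (1) can be illustrated with a minimal sketch. It assumes uncorrelated Gaussian samples as a stand-in for the Gaussian field $\phi$ (a real potential is spatially correlated), and the function name, amplitude and sample size are hypothetical choices made only for this illustration.

```python
import random
import statistics


def local_non_gaussian(phi, f_nl):
    """Apply Phi = phi + f_NL * (phi^2 - <phi^2>) sample by sample."""
    mean_sq = statistics.fmean(p * p for p in phi)  # sample estimate of <phi^2>
    return [p + f_nl * (p * p - mean_sq) for p in phi]


rng = random.Random(0)
# Toy amplitude (assumption): an rms of order 1e-5 mimics a primordial potential.
phi = [rng.gauss(0.0, 1e-5) for _ in range(100_000)]
Phi = local_non_gaussian(phi, f_nl=500.0)
```

Subtracting the sample value of $\langle\phi^2\rangle$ keeps the mean of the transformed field equal to that of $\phi$, so the quadratic term only reshapes the distribution.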

This parametrization was originally motivated by the 
single-field inflation scenarios, and it became quite popular 
shortly thereafter because it is possible to cast many inflationary 
models, including the curvaton scenario (Lyth et 
al. 2003), in the form of equation (3); namely, one can express 
departures from Gaussianity in terms of a generic 
function $F$, which may assume different model-dependent 
shapes and is broadly classified into three classes (local 
squeezed, non-local equilateral, orthogonal), and the parameter 
or function $f_{\rm NL}$. Alternatives to inflation like New Ekpyrotic 
and cyclic models are also expected to produce a large 
level of non-Gaussianity of this type (Koyama et al. 2007; 
Buchbinder et al. 2008; Lehners & Steinhardt 2008). Therefore, 
the power of this formalism is that it allows one to rule 
out a large class of models by putting constraints on $f_{\rm NL}$, 
and to reconstruct the inflationary action starting from a 
measurement of a few observables like $f_{\rm NL}$ itself. 

Note that in this paper we always use $f_{\rm NL}$ in its local 
meaning, even if the usual superscript local is not present, 
and also that there are two distinct definitions of $f_{\rm NL}$ in 
the literature, corresponding to a CMB and a LSS convention. 
In the CMB convention adopted here, the local 
non-Gaussianity is defined by equations (1)-(3) with the curvature 
perturbations $\Phi$ evaluated at early times during the matter 
domination era, when their value was constant. In the LSS 
convention, one usually assumes $\Phi$ to be the value linearly 
extrapolated at present time, and therefore it includes the 
late-time effect of the accelerated expansion in a cold dark 
matter cosmology with a cosmological constant (LCDM). 

Current limits on the primordial non-Gaussianity parameter 
$f_{\rm NL}$ at 95% confidence level (CL) from the CMB 
alone are claimed to be $-4 < f_{\rm NL} < 80$ (Smith et al. 2009), 
$-18 < f_{\rm NL} < 80$ (Curto et al. 2009), $-36 < f_{\rm NL} < 58$ (Smidt 
et al. 2010), and $-10 < f_{\rm NL} < 74$ (Komatsu et al. 2010). 
Those obtained from the LSS are similarly competitive; see 
for instance $-29 < f_{\rm NL} < 70$ by Slosar et al. (2008). 



2.2 Simulating non-Gaussian maps 

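The simulated non-Gaussian maps used in this analysis are 
constructed following the method outlined in Liguori et al. 
(2003). The main point of their procedure is to calculate the 
spherical harmonic coefficients $a_{\ell m}$ as an integral in real, 
rather than in Fourier, space. Briefly, the CMB temperature 
fluctuations are expanded in terms of spherical harmonics as 
$\delta T(\hat{\mathbf{n}}) = \sum_{\ell m} a_{\ell m} Y_{\ell m}(\hat{\mathbf{n}})$. The $a_{\ell m}$'s are then computed by 
convolving the primordial potential fluctuations with the radiation 
transfer function $\Delta_\ell$ (independently computed using 
CMBFAST developed by Seljak & Zaldarriaga 1996), as 

$$ a_{\ell m} = 4\pi(-i)^\ell \int \frac{d^3k}{(2\pi)^3}\,\Phi(\mathbf{k})\,\Delta_\ell(k)\,Y^*_{\ell m}(\hat{\mathbf{k}}) = \frac{(-i)^\ell}{2\pi^2}\int dk\,k^2\,\Phi_{\ell m}(k)\,\Delta_\ell(k) = \int dr\,r^2\,\Phi_{\ell m}(r)\,\Delta_\ell(r) \tag{4} $$

where 

$$ \Phi_{\ell m}(k) = \int d^2\hat{\mathbf{k}}\,\Phi(\mathbf{k})\,Y^*_{\ell m}(\hat{\mathbf{k}}) = 4\pi(i)^\ell \int dr\,r^2\,\Phi_{\ell m}(r)\,j_\ell(kr) \tag{5} $$

$$ \Phi_{\ell m}(r) = \frac{(-i)^\ell}{2\pi^2}\int dk\,k^2\,\Phi_{\ell m}(k)\,j_\ell(kr) \tag{6} $$

$$ \Delta_\ell(r) = \frac{2}{\pi}\int dk\,k^2\,\Delta_\ell(k)\,j_\ell(kr) \tag{7} $$

$\Phi(\mathbf{k})$ is the Fourier transform of the real space potential 
$\Phi(\mathbf{x})$ defined in equations (1) and (2), $\Phi_{\ell m}(r)$ is the real 
space harmonic potential, $\Phi_{\ell m}(k)$ is its Fourier-space counterpart and the $j_\ell$'s are 
spherical Bessel functions. 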

In the presence of non-Gaussianity of the local type, from 
equations (4)-(6) and for a constant $f_{\rm NL}$ it follows immediately 
that 

$$ \Phi_{\ell m}(r) = \Phi^{\rm G}_{\ell m}(r) + f_{\rm NL}\,\Phi^{\rm NG}_{\ell m}(r) \tag{8} $$

$$ a_{\ell m} = a^{\rm G}_{\ell m} + f_{\rm NL}\,a^{\rm NG}_{\ell m} \tag{9} $$

where in both equations the first right-hand side terms are 
the Gaussian contributions, while the second ones account 
for the $f_{\rm NL}$ part. Note that those terms are integrals over 
the corresponding potentials (i.e. $\Phi^{\rm G}$ involves $\phi$ only, while 
$\Phi^{\rm NG}$ accounts for $\phi^2$ - see again equations 1 and 3). The 
Gaussian part of $\Phi$ is obtained in real space from 

$$ \Phi^{\rm G}_{\ell m}(r) = \int dr'\,r'^2\,n_{\ell m}(r')\,W_\ell(r,r') \tag{10} $$

where $n_{\ell m}(r)$ are independent complex Gaussian variables, 
$W_\ell(r,r')$ are filter functions defined by 

$$ W_\ell(r,r') = \frac{2}{\pi}\int dk\,k^2\,\sqrt{P_\Phi(k)}\,j_\ell(kr)\,j_\ell(kr') \tag{11} $$

and obtained as in Chingangbam & Park (2009), and 
$P_\Phi(k)$ is the primordial power spectrum adopted. After 
computing the Gaussian part of the potential $\Phi^{\rm G}(\mathbf{x}) = \sum_{\ell m}\Phi^{\rm G}_{\ell m}(r)\,Y_{\ell m}(\hat{\mathbf{r}})$ 
it is straightforward to compute the 
corresponding $f_{\rm NL}$ contribution, and eventually the 
non-Gaussian temperature fluctuations via equations (4) and (9). 
Our simulations are provided in the HEALPix scheme 
(Gorski et al. 1999) at a resolution of $N_{\rm side} = 512$, giving a 
total of 3145728 pixels separated on average by 6.87 
arcmin. We adopt a standard LCDM cosmological model, 
with the WMAP 5-year best fit parameters (Komatsu et al. 
2009). An example of these realizations is shown in Figure 
1 for a small patch of the sky ($\sim 10^\circ \times 10^\circ$). Regions below 
a temperature threshold $\nu = 0.50$ or above $\nu = -0.50$ 
are set to zero, where $\nu = \delta T/\sigma$, with $\sigma$ being the rms of 
the map and $\delta T$ the temperature anisotropy. The left panel 
highlights the Gaussian case, the right panel shows the corresponding 
non-Gaussian scenario with $f_{\rm NL} = 500$. A Gaussian 
smoothing with a full width at half-maximum (FWHM) 
of 30 arcmin is applied to those regions before clipping the 
field at $\nu = \pm 0.50$. Clearly, by visual inspection it is hard to 
distinguish between the two maps, although one can easily 
show that their underlying skewness is quite different. 
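
The pixel count and mean pixel separation quoted for this resolution can be checked with a short sketch. It assumes the standard HEALPix relation of 12 N_side^2 equal-area pixels and takes the mean separation as the square root of the pixel area; the function names are illustrative only.

```python
import math


def healpix_npix(nside):
    """Total number of equal-area pixels at a given N_side."""
    return 12 * nside * nside


def mean_pixel_spacing_arcmin(nside):
    """Square root of the pixel solid angle, converted to arcminutes."""
    pixel_area_sr = 4.0 * math.pi / healpix_npix(nside)
    return math.degrees(math.sqrt(pixel_area_sr)) * 60.0


npix = healpix_npix(512)
spacing = mean_pixel_spacing_arcmin(512)
```

For N_side = 512 this gives 3145728 pixels and a spacing of about 6.87 arcmin, matching the values in the text.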



2.3 Excursion sets formalism in $f_{\rm NL}$ models 

Given a CMB map with a temperature assigned to each 
point, an excursion set is the ensemble of all pixels with temperatures 
greater than a fixed threshold. The complementary 
excursion set for temperatures lower than a given level 
is symmetrically defined; in the Gaussian case, it is expected 
to give the same results as the corresponding hot excursion 
set. If the threshold under consideration is high enough, the 
excursion set is composed of many disjoint groups of pixels, 
each group surrounding one of the local maxima or temperature 
peaks (see Figure 1). The excursion set regions are more easily 
and unambiguously identifiable in the CMB sky than 
the distributions of peaks, and at high thresholds the number 
of maxima and excursion sets coincides asymptotically. 
We are interested here in understanding how the number 
density and the clustering statistics of the excursion regions 
are modified in the presence of local, and relatively weak, 
non-Gaussianity. This theoretical framework will guide the 
interpretation of our numerical results presented in Section 
3 from a large set of non-Gaussian simulations. 
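
As an illustration of the definition above, the hot and cold excursion sets of a map can be selected as sketched below. The map is represented as a plain list of pixel temperatures, the threshold is expressed in units of the map rms, and the toy values are hypothetical.

```python
import math


def excursion_sets(delta_t, nu):
    """Return (hot, cold) pixel index lists for a threshold nu in rms units."""
    sigma = math.sqrt(sum(t * t for t in delta_t) / len(delta_t))  # rms of the map
    hot = [i for i, t in enumerate(delta_t) if t / sigma > nu]     # above +nu
    cold = [i for i, t in enumerate(delta_t) if t / sigma < -nu]   # below -nu
    return hot, cold


toy_map = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical pixel values, arbitrary units
hot, cold = excursion_sets(toy_map, nu=1.0)
```

In a Gaussian map the two index sets are statistically equivalent, which is why any asymmetry between them is a useful non-Gaussian diagnostic.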

In a full Gaussian sky and in the absence of pixel noise, 
the number density of regions above (below) a temperature 
threshold $\nu$ is simply given by: 

$$ n^{\rm G}_{\rm pix}(\nu) = \frac{N_{\rm pix,tot}}{4\pi}\,\frac{{\rm erfc}(\nu/\sqrt{2})}{2} \tag{12} $$

where $N_{\rm pix,tot} = 12 N_{\rm side}^2$ is the total number of pixels in the 
map, at a resolution specified by the parameter $N_{\rm side}$. Equation 
(12) follows immediately from an integration above (below) 
a level $\nu$ of a one-dimensional Gaussian PDF. 
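
A minimal sketch of the Gaussian number density follows. It assumes that the fraction of pixels above a threshold $\nu$ is the upper tail of a unit Gaussian, erfc($\nu/\sqrt{2}$)/2, multiplied by the total number of pixels per steradian; the function name is illustrative.

```python
import math


def n_pix_gaussian(nu, nside):
    """Number of pixels per steradian above threshold nu for a Gaussian map."""
    n_tot = 12 * nside ** 2
    tail = 0.5 * math.erfc(nu / math.sqrt(2.0))  # Gaussian area fraction above nu
    return n_tot / (4.0 * math.pi) * tail


half_sky = n_pix_gaussian(0.0, 512)  # at nu = 0 exactly half the pixels qualify
```

By symmetry the same expression gives the density of pixels below $-\nu$ in the Gaussian case.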

In the presence of non-Gaussianity of the local $f_{\rm NL}$ type, 
the theoretical formalism for the number density is complicated 
by the inclusion of an extra term which quantifies the 
role of $f_{\rm NL}$. Following Matsubara (1994, 2003) and Hikage, 
Komatsu & Matsubara (2006), for weak non-Gaussianity a 
perturbative approach by the multidimensional Edgeworth 
expansion around a Gaussian distribution function suggests 
that the expression for the number density will acquire an 
additional term: 

$$ n_{\rm pix}(\nu) = n^{\rm G}_{\rm pix}(\nu) + n^{\rm NG}_{\rm pix}(\nu) \tag{13} $$

where $n^{\rm G}_{\rm pix}$ is the Gaussian expression (12) and 

$$ n^{\rm NG}_{\rm pix}(\nu) = \frac{N_{\rm pix,tot}}{4\pi}\,\frac{\sigma S^{(0)}}{6\sqrt{2\pi}}\,(\nu^2 - 1)\,e^{-\nu^2/2}. \tag{14} $$

The skewness parameter $S^{(0)} = \langle\delta T^3\rangle/\sigma^4$ needs to be evaluated 
numerically; it contains the reduced bispectrum specific 
to the non-Gaussian model - simplified for $f_{\rm NL}$ constant, 
as given by Komatsu & Spergel (2001). Note that 
$S^{(0)}$ is an important parameter because it represents the 
leading order contribution to the non-Gaussianity. In fact 
$\sigma S^{(0)} = f_{\rm NL}\,A(\theta_s)$, with $A(\theta_s)$ a numerical coefficient which 
depends on the adopted smoothing $\theta_s$. 

Figure 2. CMB temperature distribution (mK units) in the presence 
of weak local non-Gaussianity, when no smoothing is applied. 
Points in the figure are averages over 200 non-Gaussian simulations 
with $f_{\rm NL}$ = 100 and 500, errorbars are the corresponding 
$1\sigma$ run-to-run estimates, and solid lines are from equation (15) 
for the two different $f_{\rm NL}$ values. The average rms of $\delta T$ is 0.111 
mK. (Horizontal axis: $\delta T$ [mK].) 
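
The leading-order correction to the number density can be sketched as follows, assuming the first-order Edgeworth form in which the extra term is proportional to $\sigma S^{(0)}\,(\nu^2 - 1)\,e^{-\nu^2/2}/(6\sqrt{2\pi})$. The argument `sigma_skew` stands for the product $\sigma S^{(0)}$ and the numerical value used here is hypothetical, not a measured skewness.

```python
import math


def n_pix_correction(nu, nside, sigma_skew):
    """Leading-order non-Gaussian term of the pixel number density above nu."""
    n_tot = 12 * nside ** 2
    amplitude = n_tot / (4.0 * math.pi) * sigma_skew / (6.0 * math.sqrt(2.0 * math.pi))
    return amplitude * (nu * nu - 1.0) * math.exp(-0.5 * nu * nu)


SIGMA_SKEW = 0.05  # hypothetical value of sigma * S0, for illustration only
at_one = n_pix_correction(1.0, 512, SIGMA_SKEW)
at_two = n_pix_correction(2.0, 512, SIGMA_SKEW)
at_half = n_pix_correction(0.5, 512, SIGMA_SKEW)
```

The factor $(\nu^2 - 1)$ makes the correction vanish at $\nu = 1$ and change sign across it, whatever the value of the skewness.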

This implies that the underlying one-dimensional PDF 
in the presence of local non-Gaussianity is given by: 

$$ p(\mu)\,d\mu = \frac{1}{\sqrt{2\pi}}\,e^{-\mu^2/2}\left\{1 + \frac{\sigma S^{(0)}}{6}\,\mu(\mu^2 - 3)\right\}d\mu, \tag{15} $$

where the first part on the right hand side of the equation 
is the usual Gaussian contribution, the second is the non-Gaussian 
term, and $\mu = \delta T/\sigma$ is now used to indicate the 
threshold level. A plot of this distribution in units of the 
corresponding Gaussian PDF is provided in Figure 2 for 
$f_{\rm NL}$ = 100 and 500, when no smoothing is applied. Points 
in the figure are averages over 200 realizations, errorbars 
are the $1\sigma$ run-to-run estimates from the simulations, and 
solid lines are from equation (15). Note that, although the 
non-Gaussian term in (15) is complicated by the inclusion 
of $S^{(0)}$, $S^{(0)}$ itself is independent of the threshold level; this 
will be important for the next considerations. 
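
A quick numerical check of the one-dimensional PDF is sketched below, assuming the first-order Edgeworth form with a cubic Hermite term $\mu(\mu^2 - 3)$ weighted by $\sigma S^{(0)}/6$. It verifies that the PDF stays normalized and that its integral above a threshold reproduces the Gaussian tail plus a term proportional to $(\nu^2 - 1)e^{-\nu^2/2}$. The Simpson integrator, the cutoff at 12 and the value of `sigma_skew` are illustrative choices.

```python
import math


def pdf_edgeworth(mu, sigma_skew):
    """Unit Gaussian times the first-order Edgeworth correction."""
    gauss = math.exp(-0.5 * mu * mu) / math.sqrt(2.0 * math.pi)
    return gauss * (1.0 + sigma_skew / 6.0 * mu * (mu * mu - 3.0))


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def tail_fraction(nu, sigma_skew, upper=12.0):
    """Numerical area fraction above nu (the tail beyond `upper` is negligible)."""
    return simpson(lambda m: pdf_edgeworth(m, sigma_skew), nu, upper)


def tail_fraction_analytic(nu, sigma_skew):
    """Closed form: Gaussian tail plus the (nu^2 - 1) exp(-nu^2 / 2) term."""
    gaussian = 0.5 * math.erfc(nu / math.sqrt(2.0))
    extra = (sigma_skew / (6.0 * math.sqrt(2.0 * math.pi))
             * (nu * nu - 1.0) * math.exp(-0.5 * nu * nu))
    return gaussian + extra
```

The agreement between the two tail functions shows how the number density correction follows from integrating the PDF above the threshold.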

With the one-dimensional PDF at hand, a number of 
well-known properties in the context of Gaussian random 
fields, such as the mean size and frequency of occurrence of 
the excursion sets above a given level (Coles & Barrow 1987; 
Kogut et al. 1995), can be easily generalized to $f_{\rm NL}$ models. 
We will present this analysis in a following paper, while here 
we focus primarily on the pixel clustering statistics. 

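The correlation of the excursion sets above a threshold 
$\nu$ is given by (Kaiser 1984): 

$$ 1 + \xi_\nu(\theta) = \frac{P_2}{P_1^2} \tag{16} $$

where 

$$ P_1 = \int_\nu^\infty p(\mu)\,d\mu \tag{17} $$

and 

$$ P_2 = \int_\nu^\infty d\mu_1 \int_\nu^\infty d\mu_2\; p(\mu_1,\mu_2,w) \tag{18} $$

with $p(\mu_1,\mu_2,w)$ being the two-dimensional PDF and 
$w = w(\theta) = \langle\mu_1\mu_2\rangle$ the correlation. 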

There have been attempts in the literature to generalize 
equation (16) to non-Gaussian cases. For example, 
Berry (1973), Jones (1996) and Barreiro et al. (1998) write 
$p(\mu_1,\mu_2,w)$ as: 

$$ p(\mu_1,\mu_2,w) = p(\mu_1)\,\delta_{\rm D}(\mu_1 - \mu_2)\,w + p(\mu_1)\,p(\mu_2)\,(1 - w) \tag{19} $$

so that (16) is simply given by: 

$$ 1 + \xi_\nu(\theta) = w/P_1 + (1 - w). \tag{20} $$

Expression (20) implies that one can fully characterize the 
clustering statistics above (below) threshold using only the 
knowledge of the one-dimensional PDF (15) and the correlation. 
Unfortunately, this toy model cannot be applied in 
our context; equation (20) is valid when $w$ is small, which is 
not true in our case. 

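Instead, since we are interested in weak non-Gaussianity, 
we expect a bivariate Edgeworth expansion to 
provide a reasonably good description at low thresholds: 

$$ p(\mu_1,\mu_2,w)\,d\mu_1 d\mu_2 = \frac{1}{2\pi\sqrt{1-w^2}}\exp\left[-\frac{\mu_1^2 + \mu_2^2 - 2w\mu_1\mu_2}{2(1-w^2)}\right]\left\{1 + \frac{\sigma S^{(0)}}{6}\left(H_{30} + H_{03}\right) + \frac{A}{2}\left(H_{21} + H_{12}\right)\right\}d\mu_1 d\mu_2 \tag{21} $$

where $A = \langle\mu_1^2\mu_2\rangle = \langle\mu_1\mu_2^2\rangle$ and 

$$ H_{30}(\mu_1,\mu_2,w) = H_{03}(\mu_2,\mu_1,w) = \frac{(\mu_1 - w\mu_2)^3}{(1-w^2)^3} - \frac{3(\mu_1 - w\mu_2)}{(1-w^2)^2} \tag{22} $$

$$ H_{21}(\mu_1,\mu_2,w) = H_{12}(\mu_2,\mu_1,w) = \frac{2w(\mu_1 - w\mu_2) - (\mu_2 - w\mu_1)}{(1-w^2)^2} + \frac{(\mu_2 - w\mu_1)(\mu_1 - w\mu_2)^2}{(1-w^2)^3} \tag{23} $$

Equation (21) is the two-dimensional version of the distribution 
(15) - see also Kotz, Balakrishnan & Johnson (2000) 
and Lam & Sheth (2009). Note that $w$ and $A$ must be evaluated 
numerically. By inserting (15) and (21) into (16), it 
is possible to characterize the clustering strength of pixels 
above/below threshold for weak non-Gaussianity. 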
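
The bivariate Edgeworth PDF can be sketched in code as below. This is a hedged illustration that assumes the standard bivariate Hermite polynomials of a correlated unit Gaussian pair, with the third-order terms weighted by $\sigma S^{(0)}/6$ and $A/2$; the function names and the arguments `sigma_skew` (for $\sigma S^{(0)}$) and `a_mixed` (for $A$) are hypothetical labels.

```python
import math


def h30(m1, m2, w):
    """Third-order Hermite polynomial in the first argument; H03 is h30(m2, m1, w)."""
    d = 1.0 - w * w
    a = m1 - w * m2
    return a ** 3 / d ** 3 - 3.0 * a / d ** 2


def h21(m1, m2, w):
    """Mixed (2,1) Hermite polynomial; H12 is h21(m2, m1, w)."""
    d = 1.0 - w * w
    a = m1 - w * m2
    b = m2 - w * m1
    return (2.0 * w * a - b) / d ** 2 + b * a * a / d ** 3


def pdf2_edgeworth(m1, m2, w, sigma_skew, a_mixed):
    """Bivariate Gaussian times the first-order Edgeworth correction, for |w| < 1."""
    d = 1.0 - w * w
    gauss = (math.exp(-(m1 * m1 + m2 * m2 - 2.0 * w * m1 * m2) / (2.0 * d))
             / (2.0 * math.pi * math.sqrt(d)))
    correction = (sigma_skew / 6.0 * (h30(m1, m2, w) + h30(m2, m1, w))
                  + a_mixed / 2.0 * (h21(m1, m2, w) + h21(m2, m1, w)))
    return gauss * (1.0 + correction)
```

For $w = 0$ the polynomials reduce to products of one-dimensional Hermite polynomials, and with both weights set to zero the function returns the ordinary bivariate Gaussian.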

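When $f_{\rm NL} = 0$ (i.e. in the Gaussian limit) equation (21) 
reduces to the usual bivariate Gaussian distribution, since 
$\sigma S^{(0)} = 0$ and $A = 0$. Therefore (16) reduces to the well-known 
formula: 

$$ 1 + \xi_\nu(\theta) = \sqrt{\frac{2}{\pi}}\left[{\rm erfc}\left(\frac{\nu}{\sqrt{2}}\right)\right]^{-2}\int_\nu^\infty d\mu\; e^{-\mu^2/2}\,{\rm erfc}\left[\frac{\nu - w\mu}{\sqrt{2(1-w^2)}}\right] \tag{24} $$

where 

$$ w = w(\theta) = \langle\delta T_1\,\delta T_2\rangle/\sigma^2 = C(\theta)/C(0) \tag{25} $$

with 

$$ C(\theta) = \sum_\ell \frac{2\ell+1}{4\pi}\,C_\ell\,W_\ell^{\rm smooth}\,P_\ell(\cos\theta). \tag{26} $$

$C_\ell$ is the input power spectrum, and $W_\ell^{\rm smooth}$ is the window 
function which includes all the additional smoothing. 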
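
The Gaussian clustering formula can be evaluated numerically along the following lines. The sketch assumes the Kaiser (1984) form for a bivariate Gaussian, a Legendre sum weighted by $(2\ell+1)/4\pi$ for the angular correlation, and a Gaussian beam window $\exp[-\ell(\ell+1)\sigma_b^2]$ with $\sigma_b = {\rm FWHM}/\sqrt{8\ln 2}$ as a stand-in for the smoothing window; the integrator, the cutoff and all function names are illustrative.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def one_plus_xi_gaussian(nu, w, upper=12.0):
    """1 + xi above threshold nu for a Gaussian field with correlation |w| < 1."""
    norm = math.sqrt(2.0 / math.pi) / math.erfc(nu / math.sqrt(2.0)) ** 2
    scale = math.sqrt(2.0 * (1.0 - w * w))
    return norm * simpson(
        lambda mu: math.exp(-0.5 * mu * mu) * math.erfc((nu - w * mu) / scale),
        nu, upper)


def c_theta(theta, cl, fwhm):
    """Angular correlation from a list cl[ell], with angles in radians."""
    sigma_b = fwhm / math.sqrt(8.0 * math.log(2.0))
    x = math.cos(theta)
    p_prev, p_curr = 0.0, 1.0  # p_curr starts as P_0
    total = 0.0
    for ell, c_ell in enumerate(cl):
        if ell == 1:
            p_prev, p_curr = p_curr, x
        elif ell > 1:  # Legendre three-term recurrence
            p_prev, p_curr = p_curr, ((2 * ell - 1) * x * p_curr - (ell - 1) * p_prev) / ell
        window = math.exp(-ell * (ell + 1) * sigma_b ** 2)  # assumed Gaussian beam
        total += (2 * ell + 1) / (4.0 * math.pi) * c_ell * window * p_curr
    return total


def correlation_w(theta, cl, fwhm):
    """Normalized correlation w(theta) = C(theta) / C(0)."""
    return c_theta(theta, cl, fwhm) / c_theta(0.0, cl, fwhm)
```

With $w = 0$ the excess clustering vanishes, and any positive correlation pushes the result above unity.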

2.4 Relation to other topological estimators 

The excursion set statistics belongs to a more general class 
of geometrical estimators, which retain information on the 
spatial distribution of the non-Gaussian signal. In this re- 
spect, it is related to many other commonly used topologi- 
cal estimators. For example, since the distribution of peaks 
with CMB temperatures above/below a given threshold is a 
subset of the pixel distribution, there is a direct correspon- 
dence between the excursion sets and the peak statistics. In 
presence of weak non-Gaussianity, it is relatively straightfor- 
ward to repeat the steps illustrated in the previous section 
for the peak, rather than the pixel ensemble. In fact, once 
the one- and two-dimensional non-Gaussian PDFs are known 
(equations 15 and 21), one only needs to impose an extra 
condition in order to select local maxima, but much of the 
logic remains the same. Hence, analytic expressions for the 
number density and for the clustering strength above/below 
threshold can be obtained for the peak statistics as well. We 
present a more detailed investigation of the peak clustering 
statistics, extended to non-Gaussian models, in a forthcom- 
ing publication; for an exhaustive treatment of the Gaussian 
case see instead Bond & Efstathiou (1987). 

Similarly, other topological or geometrical estimators 
which utilize information concerning the morphology of 
the density structure are also directly related to the ex- 
cursion set statistics. This is for example the case of the 
Minkowski functionals (Schmalzing & Gorski 1998; Winitzki 
& Kosowsky 1998; Banday, Zaroubi & Gorski 2000; Hikage 
et al. 2006, 2008b; Matsubara 2010); the number density 
defined in Section 2.3 is effectively the first Minkowski functional 
(i.e. fraction of total area above the threshold), apart 
from some normalization factors. The genus itself (another 
Minkowski functional) and its derived statistics (Coles 1988; 
Gott et al. 1990; Smoot et al. 1994; Colley & Gott 2003; Park 
2004; Gott et al. 2007) are also directly related to the excursion 
sets formalism. This is because the genus, being the 
number of isolated hot spots minus the number of isolated 
cold spots, can be obtained from the contours for a given 
threshold temperature and can be parametrized by the area 
fraction above the threshold - which is given by equation 
(13) for the pixel ensemble, in the weak non-Gaussian limit. 



3 CONSTRAINING NON-GAUSSIANITY WITH THE EXCURSION SET STATISTICS 

In this section we present numerical results for the number 
density and for the clustering strength of pixels above/below 
threshold, calculated from non-Gaussian simulations. We 
also perform a thorough statistical analysis to evaluate 
the sensitivity of the two observables to the level of 
non-Gaussianity and to the smoothing resolution. Our calculations 
are averaged over 200 full-sky CMB realizations, and 
the associated errorbars are the $1\sigma$ run-to-run estimates. 
The effects of noise and of other experimental artifacts, as 
well as confusion effects due to secondary non-Gaussianities, 
are not addressed in this analysis; rather, in this first work 
our main goal is to characterize the intrinsic non-Gaussian 
CMB signal within the excursion set statistics. However, in 
Appendices B, C and D we briefly explain how to include 
these complications in our theoretical framework. 

Figure 3. Number density of pixels in a Gaussian case (left panels) or in $f_{\rm NL}$ models (in the middle panels $f_{\rm NL}$ = 100; in the right ones 
$f_{\rm NL}$ = 500). A Gaussian smoothing with FWHM=30' and 60' is applied in the central and lower panels, respectively, at $N_{\rm side}$ = 512. 
Errorbars are the $1\sigma$ run-to-run estimates from 200 maps. Solid curves are theory predictions from equations (12), (13) and (14). 

3.1 Number density of the excursion set pixels 

Figure 3 shows the variation with f_NL of the pixel number
density above (below) threshold, normalized by the
expectation from Gaussian theory (equation 12). No smoothing
is applied in the top panels, while a Gaussian smoothing
with FWHM of 30 and 60 arcmin is applied in the
intermediate and bottom panels, respectively, at a resolution
N_side = 512. Solid curves are analytic predictions for f_NL
type non-Gaussianity from equations (12)-(14): they are in
very good agreement with our numerical results. Note that,
because of the relatively small number of maps considered, in
practice the Gaussian mean undergoes a small shift at higher
thresholds, caused by statistical fluctuations. We have
accounted for this effect in our calculations, and shifted all the
corresponding non-Gaussian means by the same amount, as
done in Chingangbam & Park (2009). This is the reason why
in the figure and in the following ones (4-9) all the Gaussian
expectations lie exactly on a line. Clearly, this procedure
does not affect the relative distance between Gaussian and
non-Gaussian means, the quantity we want to characterize
here; hence, our results are independent of this small
rescaling.
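
As an illustration of the measurement itself, the following minimal Python sketch counts the pixels above +ν and below -ν in a map and divides by the Gaussian expectation. It is not the pipeline used for the simulations: it assumes that the Gaussian expectation of equation (12) is the usual complementary error function tail, it uses a flat array of uncorrelated pixels in place of a HEALPix map, and the function names are hypothetical.

```python
import math
import numpy as np

def gaussian_fraction_above(nu):
    """Fraction of pixels with T/sigma > nu for a Gaussian field (assumed form)."""
    return 0.5 * math.erfc(nu / math.sqrt(2.0))

def excursion_ratios(temperature_map, nu):
    """Hot (> +nu) and cold (< -nu) pixel fractions, in units of the Gaussian value."""
    t = np.asarray(temperature_map, dtype=float)
    x = (t - t.mean()) / t.std()          # threshold variable nu = dT / sigma
    hot = np.mean(x > nu)
    cold = np.mean(x < -nu)
    expected = gaussian_fraction_above(nu)
    return hot / expected, cold / expected

# Toy Gaussian "map": both ratios should scatter around unity.
rng = np.random.default_rng(0)
toy_map = rng.standard_normal(200_000)
hot_ratio, cold_ratio = excursion_ratios(toy_map, nu=2.0)
```

For a Gaussian input both ratios fluctuate around one; a local-type non-Gaussian map would move them in opposite directions.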

A number of interesting features can be inferred from
Figure 3. First, there are two regions where the
non-Gaussian contribution appears to be more significant,
namely at relatively low thresholds (ν = 0.25, 0.50) or
around ν = 2.00. Second, and never pointed out so far
in the literature, there are optimal thresholds which
maximize the local non-Gaussianity, as well as others which
do not allow for a distinction between the Gaussian and the
non-Gaussian case. This is expected from equations (13)
and (14): in particular, when ν = 1 the non-Gaussian number
densities of hot and cold pixels both reduce to the Gaussian
expectation. Therefore, in this statistical framework levels
around ν = 1 are not sensitive to departures from
Gaussianity of the f_NL local type. Third, at higher thresholds a
positive f_NL causes an enhancement of the number density
of the cold pixels and reduces that of the hot ones, while
the opposite trend happens when ν < 1. This effect is more
evident for larger f_NL values. We will use these findings to
devise a new optimal statistical test in the next subsection.
An additional Gaussian smoothing increases the errorbars
in the number density calculations, and slightly reduces the
effect just described.
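
The insensitivity at ν = 1 and the opposite trends on either side of it can be checked with a short sketch. It assumes that equations (13) and (14) have the standard first-order Edgeworth form, in which the correction to the tail fraction is proportional to the Hermite polynomial H_2(ν) = ν² - 1; the parameter `skew_amplitude` is a hypothetical stand-in for the combination σS^(0)/6, and its sign and size for a given f_NL are not modelled here.

```python
import math

def gaussian_tail(nu):
    """Gaussian fraction of pixels above nu (equal to the fraction below -nu)."""
    return 0.5 * math.erfc(nu / math.sqrt(2.0))

def edgeworth_ratios(nu, skew_amplitude):
    """Hot and cold tail fractions over the Gaussian one, to first order in skewness."""
    phi = math.exp(-0.5 * nu * nu) / math.sqrt(2.0 * math.pi)
    correction = skew_amplitude * (nu * nu - 1.0) * phi / gaussian_tail(nu)
    hot = 1.0 + correction       # pixels above +nu
    cold = 1.0 - correction      # pixels below -nu: the odd term flips sign
    return hot, cold

# A negative temperature skewness is used purely as an example.
hot_1, cold_1 = edgeworth_ratios(1.0, -0.05)
hot_2, cold_2 = edgeworth_ratios(2.0, -0.05)
hot_05, cold_05 = edgeworth_ratios(0.5, -0.05)
```

With this sign choice the cold tail is enhanced and the hot tail reduced above ν = 1, the trend reverses below ν = 1, and both ratios equal one exactly at ν = 1.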

Figure 4 displays the difference (left panels) and the
ratio (right panels) between the number density of hot and
cold pixels, at corresponding temperature thresholds, when
no smoothing is applied. Top panels highlight the case of
f_NL = 100, bottom panels are for f_NL = 500. Solid curves
show the analytic predictions, which are easily derived from
equations (13) and (14). Again, we find a very good
agreement between numerical results and analytical expectations.
At ν = 1, a 'transition area' in the number density is clearly
visible, particularly when we consider the difference between
hot and cold excursion set regions.



3.2 Statistical test derived from the number 
density 

The conclusions drawn from Figures 3 and 4 can be
expressed in a more quantitative form as follows. If we assume
n_pix to be the possible non-Gaussian discriminator, we can
plot the number density measurements in terms of their
errorbars. In other words, we can normalize all the points in
Figures 3 and 4 by their associated run-to-run errors, and
quantify their 'distance' from the expected Gaussian
predictions. This is quite convenient, as it allows one to
identify which thresholds are particularly sensitive to a local
non-Gaussian signal, and to determine the exact values of ν
which maximize departures from Gaussianity.

[Figure 4 panels; horizontal axes: ν = δT/σ]

Figure 4. Difference (left) and ratio (right) between hot and cold
excursion set regions, at corresponding temperature thresholds,
when no smoothing is applied. In the top panels f_NL = 100, in the
bottom ones f_NL = 500. Solid lines are theoretical expectations
derived from equations (13) and (14). At ν = 1, a transition area
in the number density is clearly visible.

In Figure 5 we show the case when no smoothing is
applied and reinterpret in this context the number density
per Gaussian units (left panel), the difference (middle panel)
and the ratio (right panel) between the abundance of hot
and cold pixels. Shaded areas represent the 1σ and 2σ errors,
while different symbols are used for different values of f_NL, as
specified in the plots. When ν = 0.25, 0.50 or ν = 2.00, 2.25
departures from Gaussianity are maximized: they exceed the
1σ level for f_NL = 100. Instead, the transition area around
ν = 1 is insensitive to a non-Gaussian signal of the f_NL
type. At higher thresholds, departures are more significant
with increasing f_NL; however, severe pixel noise and poor
statistics (too few excursion sets) prevent them from being
reliable non-Gaussian indicators.
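
The conversion to errorbar units can be sketched as follows. This is a minimal illustration, assuming that each point is normalized by the run-to-run scatter of the non-Gaussian measurements themselves; the array layout (one row per realization, one column per threshold) and the function name are hypothetical.

```python
import numpy as np

def departure_in_sigma(ng_runs, gauss_runs):
    """Distance of the non-Gaussian mean from the Gaussian mean, in units of
    the 1-sigma run-to-run scatter, evaluated separately for each threshold."""
    ng = np.asarray(ng_runs, dtype=float)      # shape (n_runs, n_thresholds)
    gauss = np.asarray(gauss_runs, dtype=float)
    sigma = ng.std(axis=0, ddof=1)             # run-to-run error per threshold
    return (ng.mean(axis=0) - gauss.mean(axis=0)) / sigma

# Two toy realizations at two thresholds.
toy_ng = [[1.0, 2.0], [3.0, 2.5]]
toy_gauss = [[1.0, 2.0], [1.0, 2.0]]
significance = departure_in_sigma(toy_ng, toy_gauss)
```

Values of the returned array above 1 or 2 correspond to points lying outside the 1σ or 2σ shaded areas.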

Although the sensitivity of the first skewness parameter
S^(0) to f_NL, and so of the number density itself, is
much worse than that of the angular bispectrum (Komatsu
& Spergel 2001), the previous findings suggest that we could
construct a derived quantity which amplifies the f_NL
contribution. This is achieved by combining two thresholds, where
departures from Gaussianity are most significant. Namely,

n_{hc}^{NG} = n_{hc}^{+} - n_{hc}^{-}    (27)

where

n_{hc}^{+} = n_{pix}^{NG}(\nu = 0.50) - n_{pix}^{NG}(\nu = 2.00)    (28)

n_{hc}^{-} = n_{pix}^{NG}(\nu = -0.50) - n_{pix}^{NG}(\nu = -2.00).    (29)

Figure 6 shows measurements of this quantity from the
simulations, as a function of the smoothing scale adopted. The
left panel is in real units, the right panel is in errorbar units
as in Figure 5. By combining the two optimal levels the
sensitivity slightly improves, but still remains at the 1σ level
for f_NL = 100. This is because the associated errorbars at
different thresholds are correlated.

Figure 5. Reinterpretation of Figure 3 (left panel) and Figure 4 (middle and right panels) in terms of the run-to-run associated errors,
in order to quantify the sensitivity of the number density to local non-Gaussianity. Different values of f_NL are displayed, as indicated
in the plots, when no smoothing is applied. Departures from Gaussianity are maximized around ν = 0.25, 0.50 or around ν = 2.00, 2.25
while areas close to ν = 1.00 are insensitive to non-Gaussianity of the local type.

[Figure 6 panels; horizontal axes: Smoothing [arcmin]]

Figure 6. Composite quantity for the pixel number density as a function of the smoothing scale, defined by equation (27) in the main
text. The left panel is in real units, with errorbars estimated from 200 realizations; the right panel shows similar quantities but in RMS
units as in Figure 5. The two areas where the non-Gaussian sensitivity is maximized (i.e. |ν| = 0.50 and |ν| = 2.00) are combined, in
order to boost the departure from Gaussianity.
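
A minimal sketch of the composite quantity is given below. It assumes that equation (27) is the difference between the positive-threshold and negative-threshold combinations of equations (28) and (29); the dictionary keyed by threshold value is a hypothetical container, not part of the original analysis.

```python
def composite_number_density(n_pix):
    """Combine the thresholds |nu| = 0.50 and |nu| = 2.00 into one statistic.

    n_pix maps a threshold nu to the measured pixel number density at that nu.
    """
    n_plus = n_pix[0.50] - n_pix[2.00]        # positive thresholds, equation (28)
    n_minus = n_pix[-0.50] - n_pix[-2.00]     # negative thresholds, equation (29)
    return n_plus - n_minus                   # assumed form of equation (27)

# Toy numbers, chosen only to exercise the arithmetic.
toy_counts = {0.50: 10.0, 2.00: 2.0, -0.50: 9.0, -2.00: 4.0}
composite = composite_number_density(toy_counts)
```

For a symmetric (Gaussian) set of counts the two combinations coincide and the statistic vanishes, so any non-zero value measures the hot/cold asymmetry.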

3.3 Clustering strength of the excursion set pixels 

Next, we consider the clustering strength of the excursion
set regions. We analyze in detail the case of |ν| = 2.00, where
according to our previous findings the sensitivity to a local
f_NL type non-Gaussianity is maximized.

Figure 7 shows measurements of the hot and cold pixel
correlations above/below threshold from the simulations.
Left panels highlight the Gaussian case, intermediate panels
are for f_NL = 100, and in the right panels f_NL = 500. A
smoothing with FWHM of 30 and 60 arcmin is applied in
the central and bottom panels, respectively. Solid lines are
analytic predictions in the Gaussian limit, i.e. equation (24).

The various unweighted correlation functions are
calculated as explained in Rossi et al. (2009). In particular, the
number of random pairs is computed by distributing
random points on a unit sphere, and then by pixelizing them
in the HEALPix scheme at the same resolution as the maps.
The random realization used contains at least 10 times more
points than the simulated samples. All the errorbars are
estimated directly from 200 realizations. Note that the scatter
between different runs can be quite significant, especially at
relatively large angular scales.

Figure 7. Clustering strength of pixels above/below threshold when |ν| = 2.00, a regime particularly sensitive to a non-Gaussian signal
of the local f_NL type. Left panels represent the Gaussian case, in the middle panels f_NL = 100, in the right panels f_NL = 500. A Gaussian
smoothing with FWHM of 30 and 60 arcmin is applied in the middle and bottom panels, respectively. Solid lines are analytic predictions
in the Gaussian limit, from equation (24).
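
The pair-counting step can be illustrated with a small NumPy sketch. It is only a schematic stand-in for the procedure of Rossi et al. (2009): the natural estimator DD/RR - 1 is an assumed choice, the HEALPix pixelization of the random points is omitted, and all function names are hypothetical.

```python
import numpy as np

def random_unit_vectors(n, rng):
    """Draw n points uniformly distributed on the unit sphere."""
    v = rng.standard_normal((n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def pair_counts(vectors, bin_edges_deg):
    """Normalized histogram of angular separations over all distinct pairs."""
    cosines = np.clip(vectors @ vectors.T, -1.0, 1.0)
    upper = np.triu_indices(len(vectors), k=1)       # each pair counted once
    angles = np.degrees(np.arccos(cosines[upper]))
    counts, _ = np.histogram(angles, bins=bin_edges_deg)
    return counts / len(angles)

def correlation_function(data_vectors, random_vectors, bin_edges_deg):
    """Unweighted estimate xi = DD/RR - 1 (assumed estimator)."""
    dd = pair_counts(data_vectors, bin_edges_deg)
    rr = pair_counts(random_vectors, bin_edges_deg)
    return dd / rr - 1.0

rng = np.random.default_rng(1)
edges = np.linspace(0.0, 180.0, 7)
data = random_unit_vectors(200, rng)        # stand-in for pixels above threshold
randoms = random_unit_vectors(2000, rng)    # 10 times more random points
xi = correlation_function(data, randoms, edges)
```

Because the toy data points are themselves unclustered, the estimate scatters around zero in every bin; excursion set pixels from a real map would replace `data`.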

From Figure 7 it is evident that, for a large and positive
f_NL value, the clustering of the cold pixels is enhanced with
respect to that of the hot ones. This peculiar feature is a
distinct signature of non-Gaussianity of the local f_NL type.
It is most prominent at angular scales of about θ ~ 75',
the Doppler transition due to the sharp turn-down in the
power spectrum at ℓ ~ 1500 (because of the thickness of
the last scattering surface). The asymmetry in the
excursion set clustering is not surprising: it is expected from the
corresponding number density behavior (Figures 3 and 4),
and from the shape of the non-Gaussian potential (equation
1). In fact, at thresholds ν > 1 the cold pixel abundance is
amplified with f_NL, while the number density of the hot
pixels is reduced. This causes the difference in the clustering.
Also, the quadratic term in (1) is insensitive to a change in
sign, hence the asymmetry between hot and cold regions.
However, when f_NL = 100 this feature is small, as shown
in the middle panels of Figure 7. Therefore, at N_side = 512
the Gaussian correlation function of the excursion sets is
not easily distinguishable from a non-Gaussian one, if the
model of non-Gaussianity is of the local f_NL type. Turning
the argument around, the excursion set clustering statistic
does not provide accurate constraints on Gaussianity itself
(at least with this particular non-Gaussian model in mind),
contrary to what was previously thought (e.g. Kogut et al.
1995; Barreiro et al. 1998; Heavens & Sheth 1999; Heavens
& Gupta 2001). Working at higher resolution would be
more advantageous, as at N_side = 512 the predicted
errorbars from our simulations are quite large - although they are
rather pessimistic estimates, because they are based on 200
runs only. Moreover, smoothing the maps has a more
dramatic effect on the clustering of the excursion set regions
than on their abundance: the larger-scale power is
suppressed, while the small-scale strength is enhanced. This
results in an overall suppression of the clustering feature
previously described, particularly when f_NL = 100 (see the
bottom panels in Figure 7). Note that as we increase the
smoothing, the clustering of hot and cold pixels makes a
transition between FWHM 30 and 60 arcmin, and the
clustering behavior is reversed. At larger smoothing values, we
expect the hot pixels to cluster more at larger θ. This
indicates that large values of FWHM will still be useful for
comparison of this quantity with real data. We are
addressing this issue in a forthcoming paper, where we deal with
the detectability of these non-Gaussian features from a real
dataset.

While in Figure 7 solid lines are analytic predictions
in the Gaussian limit from equation (24), in Appendix A
we show an example of how well the Edgeworth
approximation (i.e. equation 21) works for the clustering, when
f_NL = 100. In equation (21), the second and third terms
inside the square bracket become important when f_NL is
non-zero. While S^(0) is straightforward to compute, the term A is
much more complicated since it involves the computation of
the full-sky three-point function, and it is beyond the scope
of this paper. However, we find that ignoring the A term
still gives relatively good agreement with the results from
the simulations.

In Figure 8 we show the clustering difference (left
panels) and ratio (right panels) between hot and cold excursion
set regions, for two significant values of f_NL. No smoothing
is applied. This is done in parallel with the number density
case (Figure 4). When f_NL = 100, the estimated sizes of our
errorbars suggest again that a clustering analysis is not ideal
to detect departures from Gaussianity, although the
behavior at f_NL = 500 is quite peculiar. In the next subsection we
propose an optimized statistical test based on the clustering
strength, which aims at maximizing the difference between
the Gaussian and the non-Gaussian cases.



3.4 Statistical test derived from the clustering 
strength 

The conclusions drawn from Figures 7 and 8 can be
expressed in a more quantitative form, as was done for the
number density in Section 3.2. Assuming now ξ_ν(θ) to be
the non-Gaussian discriminator, Figure 9 shows in errorbar
units the correlation strength of the excursion set regions
per Gaussian expectations (left panel), the difference
(middle panel) and the ratio (right panel) between the clustering
of hot and cold pixels when |ν| = 2.00. No smoothing is
applied. Shaded areas represent the 1σ and 2σ zones, while
different symbols are used for different values of f_NL, as
specified in the plots. When f_NL = 100, departures from
Gaussianity always lie below the 1σ level, unlike for the
abundance case (see Figure 5 for a direct comparison). The
situation does not improve significantly if we consider the
clustering difference or ratio between hot and cold patches.
Only when f_NL = 500 is there a noticeable effect, which
exceeds 2σ around the Doppler scale, at θ ~ 75 arcmin.

Figure 8. Clustering difference (right panels) and ratio (left
panels) between hot and cold excursion set regions at |ν| = 2.00,
when no smoothing is applied. In the top panels f_NL = 100, in
the bottom ones f_NL = 500. See the main text for more details.

The scatter in the clustering strength is mainly due to
cosmic variance, which causes large fluctuations among
different full-sky realizations (m- and ℓ-modes). As a result,
errorbars are large. To minimize its effect, we propose a
statistical test which involves the clustering information alone.
The procedure can be summarized as follows.

(i) Consider a non-Gaussian CMB temperature map and
extract its power spectrum.

(ii) Use equation (24) to compute the corresponding
analytic expectation for the pixel correlations above/below a
threshold ν, as if the map were thought to be Gaussian.
Denote this quantity as ξ_{ν,Gauss}^{NG}; it is the same for hot and cold
excursion set regions, in the Gaussian statistics.

(iii) Compute the hot and cold correlation functions ξ_{ν,hot}^{NG}
and ξ_{ν,cold}^{NG} directly from the map, at the same threshold level.

(iv) Construct the quantities:

\left( \xi_{\nu,hot}^{NG} - \xi_{\nu,Gauss}^{NG} \right) / \xi_{\nu,Gauss}^{NG}    (30)

\left( \xi_{\nu,cold}^{NG} - \xi_{\nu,Gauss}^{NG} \right) / \xi_{\nu,Gauss}^{NG}    (31)

and plot them as a function of the angular separation θ.

(v) Repeat this procedure for the entire set of
non-Gaussian maps, with different f_NL values.
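
Step (iv) of the procedure can be sketched as below. The sketch assumes that the quantities (30) and (31) are fractional differences between the measured hot or cold correlation function and the Gaussian expectation built from the same map's power spectrum; the function name and the toy input values are hypothetical.

```python
import numpy as np

def clustering_excess(xi_hot, xi_cold, xi_gauss):
    """Fractional departure of the hot and cold correlation functions from the
    Gaussian expectation, bin by bin in angular separation."""
    xi_hot = np.asarray(xi_hot, dtype=float)
    xi_cold = np.asarray(xi_cold, dtype=float)
    xi_gauss = np.asarray(xi_gauss, dtype=float)
    hot_excess = (xi_hot - xi_gauss) / xi_gauss      # assumed form of (30)
    cold_excess = (xi_cold - xi_gauss) / xi_gauss    # assumed form of (31)
    return hot_excess, cold_excess

# Toy values in two angular bins.
hot_excess, cold_excess = clustering_excess(
    xi_hot=[0.55, 0.21], xi_cold=[0.60, 0.25], xi_gauss=[0.50, 0.20])
```

Step (v) then amounts to calling this function once per simulated map and averaging the outputs for each f_NL value.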

Results of these measurements are shown in the left part
of Figure 10: f_NL = 100 in the top panels, f_NL = 500 in the
bottom ones. By subtracting the power spectrum
contribution in the numerator of (30) and (31), the cosmic variance
is partially reduced because fluctuations in the
corresponding ℓ-modes are cancelled. Unfortunately, the scatter in the
m-modes is still quite significant, so that at f_NL = 100 it
is not possible to distinguish a weakly non-Gaussian signal
from a Gaussian one using the clustering information alone.

Figure 9. Reinterpretation of Figure 7 (left panel) and Figure 8 (middle and right panels) in terms of the run-to-run associated errors,
in order to quantify the sensitivity of the excursion set clustering strength to local non-Gaussianity. Different values of f_NL are displayed,
as indicated in the plots, as a function of the angular separation θ. No smoothing is applied. Only when f_NL is of the order of 500 is there
a noticeable difference in the clustering, which exceeds the 2σ level around the Doppler scale (θ ~ 75') when |ν| = 2.00.

Figure 10. Statistical test which involves the clustering information alone, as explained in the main text. [Left] Measurements of the
quantities (30) and (31) from the non-Gaussian simulations, with the corresponding Gaussian theoretical prediction ξ_{ν,Gauss}^{NG} estimated
from the same maps. [Right] Same as the left panels, but now ξ_{ν,Gauss}^{NG} is computed from the corresponding Gaussian maps with the
same random seeds as the non-Gaussian ones. In this way, the cosmic variance effect is completely cancelled.

The idealized situation is presented in the right part
of the same figure. Here, we replace ξ_{ν,Gauss}^{NG} by its direct
measurement from the corresponding Gaussian map with
the same random seed. In other words, we do not use
equation (30) and the non-Gaussian power spectrum to
determine ξ_{ν,Gauss}^{NG}. Instead, we produce a Gaussian map with the
same random seed as the non-Gaussian one, and use it to
compute the theoretical expectation. This is repeated for
the entire set of non-Gaussian realizations. The procedure
is essentially the CMB analogue of what has been proposed
by Seljak (2009) for the LSS. In fact, in this way the cosmic
variance effect is completely cancelled; even at f_NL = 100,
a clustering analysis would then provide a ~ 5% difference at
the Doppler peak with respect to the Gaussian case.
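
The benefit of matching random seeds can be illustrated with a toy experiment. The sketch below is not a CMB simulation: it uses uncorrelated pixels, a local-type transformation with a hypothetical amplitude `f_nl_toy` expressed in units of the field rms, and the pixel fraction above threshold as the statistic. It only shows that differencing a non-Gaussian map against the Gaussian map built from the same seed has a much smaller run-to-run scatter than differencing against an independent Gaussian map.

```python
import numpy as np

def local_ng_map(phi, f_nl_toy):
    """Toy local-type field: phi + f_nl_toy * (phi^2 - <phi^2>)."""
    return phi + f_nl_toy * (phi**2 - np.mean(phi**2))

def fraction_above(field, nu):
    """Fraction of pixels above nu after normalizing to zero mean, unit rms."""
    x = (field - field.mean()) / field.std()
    return np.mean(x > nu)

n_pix, n_runs, nu, f_nl_toy = 20_000, 40, 2.0, 0.05
paired, unpaired = [], []
for seed in range(n_runs):
    phi = np.random.default_rng(seed).standard_normal(n_pix)
    other = np.random.default_rng(seed + 1000).standard_normal(n_pix)
    f_ng = fraction_above(local_ng_map(phi, f_nl_toy), nu)
    paired.append(f_ng - fraction_above(phi, nu))       # same seed
    unpaired.append(f_ng - fraction_above(other, nu))   # independent seed

paired_scatter = np.std(paired, ddof=1)
unpaired_scatter = np.std(unpaired, ddof=1)
```

In this toy setting the paired scatter is smaller because the same realization-specific fluctuations enter both terms of the difference and cancel.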

The situation described here is clearly ideal, because
only the first procedure (left panels in Figure 10) can be
performed from a real dataset. However, it provides some
important insights: cosmic variance is the real limit and
obstacle within this statistical framework. If we could somehow
control the fluctuations in the m-modes, then the excursion
set analysis would provide a powerful tool to detect
non-Gaussianity. The problem is that the CMB alone does not
allow one to compare 'tracers' at different epochs (as for
the LSS case), and so to eliminate completely the effect of
cosmic variance.



4 CONCLUSIONS 

We have extended and applied the statistics of the
excursion sets to models with primordial non-Gaussianity of the
f_NL local type. While in the presence of Gaussian initial
conditions many statistics based on geometrical and topological
properties of the CMB temperature have been developed
and well-studied, to date fewer analyses have been focused
on geometrical properties of the CMB radiation in the
presence of primordial non-Gaussianity. In particular, our work
is the first extension of the excursion set formalism to local
f_NL type non-Gaussianity.

From a large set of simulated full-sky non-Gaussian
maps, we computed the number density and the spatial
clustering of CMB patches above/below a temperature
threshold (Section 3). We found that a positive value of f_NL
enhances the number density of the cold CMB excursion sets
(Figures 3, 4) along with their clustering strength (Figures
7, 8) and reduces that of the hot ones.

We performed a thorough statistical analysis to
evaluate the sensitivity of the two observables to the level of
non-Gaussianity and to the smoothing resolution. We also
provided the analytical formalism to interpret our results
(Section 2). Expressions for the one- and two-dimensional
PDFs (Equations 15 and 21) were obtained from a
perturbative approach by the multidimensional Edgeworth expansion
around a Gaussian distribution function, and used to
characterize the abundance and clustering statistics as a
function of f_NL (Equations 13, 14). We showed that there
are optimal thresholds which maximize the local f_NL
non-Gaussianity (ν = 0.25, 0.50 and ν = 2.00, 2.25), as well as
others (ν = 1.00) which do not allow for a distinction
between the Gaussian and the non-Gaussian signals (Figures 5
and 9). We devised a new statistical test based on the number
density (Section 3.2), which combines two thresholds where
departures from Gaussianity are most significant (Figure 6
and Equation 27). We also proposed a new procedure aimed
at minimizing the effect of cosmic variance (Section 3.4),
which involves the clustering information alone (Figure 10,
Equations 30 and 31).

Although we focused here on f_NL models of the local
type, the statistical tools developed are more general and can
be applied to describe any other type of non-Gaussianity. A
typical example is represented by the curvaton model, for
which the cubic term indicated as g_NL can be large, while
f_NL can be negligible. Our technique can be applied to this
case as well, and it is the subject of a forthcoming
publication.

This work was primarily motivated by our previous
finding (Rossi et al. 2009), namely a remarkable difference in
the clustering of hot and cold pixels at relatively small
angular scales from the WMAP 5-yr data. We analyzed the
possibility that this discrepancy may arise from primordial
non-Gaussianity of the local f_NL type (Section 3.3), and
concluded that only a large value of f_NL would provide such a
difference (Figure 7). Cosmic variance plays a crucial role
within this statistical framework, so that the Gaussian
correlation function of the excursion sets is not easily
distinguishable from the non-Gaussian one, contrary to what was
previously thought. In fact, while a distinct signature in the
clustering of hot and cold pixels clearly emerges for a large
f_NL non-Gaussianity, particularly at angular scales of about
75 arcmin (around the Doppler peak), as expected this
feature is reduced when f_NL = 100. The clustering behavior is
also strongly affected by the smoothing angle. These findings
suggest that Gaussianity itself cannot be accurately
constrained from the excursion set clustering statistics. In fact,
although in principle the use of pixel-pixel correlation functions
as a test of Gaussianity is very powerful, because there are
no free parameters once the underlying power spectrum has
been measured, this may not be the case if the non-Gaussian
model is of the f_NL local type, and f_NL is small.

Our study was focused on a few selected values of 
thresholds and two different statistics, so that the predicted 
constraints on f_NL are wider than what one would get by
combining several threshold levels and different smoothing 
angles. In this respect, our predicted constraints from the ex- 
cursion sets are compatible with those of Smidt et al. (2010), 
obtained from the trispectrum. 

Since cosmic variance is the main obstacle in the analy- 
sis, we are considering derived statistics which could poten- 
tially beat its effect and maximize the non-Gaussian contri- 
bution. It is also important to adopt different and comple- 
mentary statistical approaches, and not just a single view, 
because there is no single statistic which describes fully and
uniquely the non-Gaussian nature of a sample. To this end,
a lot of effort has recently gone into developing optimal es- 
timators, and in this sense our statistical technique belongs 
to a class of topological estimators which may be considered 
"sub-optimal" for measuring non-Gaussianity. However, in 
reality all the geometrical methods complement and "diag- 
nose" results obtained with bispectrum or trispectrum esti- 
mators. Moreover, geometrical techniques are often model- 
independent, easy to implement, with low computational 
cost, and they can retain information on the spatial distri- 
bution of the non-Gaussian signal. Also, they provide useful 
analytic insights and physical intuition. For example, the 
derivation and implementation of the analytical formula for 
the CMB Minkowski functionals in the limit of weak
non-Gaussianity (Hikage, Komatsu & Matsubara 2006;
Matsubara 2010) has allowed limits to be obtained on various models,
for which the optimal estimators are difficult to implement; 
at the moment, a limit on the primordial non-Gaussianity 
in the isocurvature perturbation is available only from the 
Minkowski functionals (Hikage et al. 2009). Note also that 
the concept of "optimal" is often misleading, as it requires 
a posteriori knowledge of the type of non-Gaussianity which 
is, at least in principle, unknown. The main question, in- 
stead, is whether or not it is possible to improve limits on 
f_NL using the CMB data only.

Including realistic effects in our simulations, such as
inhomogeneous noise, point source contamination or
foregrounds, so that we can compare our predictions with
current observations, is the subject of ongoing work (we provide
some discussion in Appendices B and D). We present
results of these investigations in a companion paper, where
we also consider more terms in the expansion (1).
Application of the formalism presented in Section 2.3 to peak
rather than pixel statistics is a straightforward exercise, and
is also the subject of another forthcoming publication.

The Planck satellite with its increased sensitivity and 
resolution is expected to improve the measurements of 
most cosmological parameters by several factors compared 
to WMAP, and in synergies with future galaxy surveys 
(Colombo, Pierpaoli & Pritchard 2009). In fact, Planck gains 
a factor of 2.5 in angular resolution and up to 10 in instan- 
taneous sensitivity with respect to WMAP, and it is nearly 
photon noise limited in the CMB channels (100-200 GHz). 
Repeating this analysis at the Planck resolution may then 
provide more stringent limits on f_NL from the excursion set
statistics, and is also the subject of work in progress. 
APPENDIX A: ANALYTIC NON-GAUSSIAN
PREDICTIONS FOR THE EXCURSION SET 
PIXEL CLUSTERING 

The multidimensional Edgeworth expansion is a convenient 
way of approximating a PDF in terms of its cumulants. It 
is a true asymptotic expansion, so that the error is well- 
controlled; it can be used to describe weak non-Gaussianity. 

Figure A1 shows an example of how well the Edgeworth
approximation works for the CMB excursion set
clustering. Points in the figure are hot and cold pixel correlations
above/below a threshold |ν| = 2.00 from the simulations,
when f_NL = 100 and no smoothing is applied. The solid line is
the Gaussian analytic prediction from equation (24); dotted
lines are the non-Gaussian analytic expectations obtained by
using equation (21) in (18) and (16), and by ignoring the A
term. The agreement between numerical results and theory
predictions is still good at large angular scales θ. However,
for small values of θ we expect A → σS^(0), hence this term
becomes important in (21): this is why in Figure A1 the
analytic prediction fits poorly in that regime. Conversely,
at higher threshold levels and when σS^(0) becomes large the
Edgeworth expansion cannot be used.

Figure A1. Example of how well the Edgeworth
approximation works in describing the CMB excursion set clustering when
f_NL = 100. Points are hot and cold pixel correlations above/below
|ν| = 2.00 measured from the simulations, when no smoothing is
applied. Solid line is the Gaussian analytic prediction (equation
24). Dotted lines are the non-Gaussian analytic expectations
obtained by using equation (21) and by ignoring the A term. The
agreement between numerical results and theory predictions is
reasonably good only at large angular scales θ.



APPENDIX B: INHOMOGENEOUS NOISE 
AND PARTIAL SKY COVERAGE 

The number density and the correlation strength of pixels
above/below a temperature threshold ν can be generically
expressed by

n_{pix}(\nu) = N_{pix,tot} \cdot P_1,    (B1)

1 + \xi_\nu(\theta) = P_2 / P_1^2,    (B2)

where N_pix,tot is the total number of pixels, and P_1 and P_2
are defined in equations (17) and (18). By inserting the
corresponding one- and two-dimensional PDFs in those
equations, and by using their output in (B1) and (B2), one can
readily characterize the pixel number density and the
clustering statistics above/below threshold in the fully Gaussian
case or the weak non-Gaussian limit (equations 12-24).
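
In the fully Gaussian case, (B1) and (B2) can be evaluated numerically with a few lines of Python. The sketch assumes that P_1 and P_2 are the one- and two-point probabilities of exceeding the threshold for unit-variance Gaussian variables with correlation coefficient w; the function names and the trapezoid integration grid are illustrative choices, not taken from the paper.

```python
import math
import numpy as np

def p1_gauss(nu):
    """One-point probability of a pixel lying above the threshold nu."""
    return 0.5 * math.erfc(nu / math.sqrt(2.0))

def p2_gauss(nu, w, n_steps=20001, upper=10.0):
    """Probability that two pixels with correlation w (|w| < 1) both exceed nu."""
    x = np.linspace(nu, nu + upper, n_steps)
    phi = np.exp(-0.5 * x**2) / math.sqrt(2.0 * math.pi)
    arg = (nu - w * x) / math.sqrt(2.0 * (1.0 - w * w))
    cond = 0.5 * np.array([math.erfc(a) for a in arg])   # P(x2 > nu | x1 = x)
    y = phi * cond
    dx = x[1] - x[0]
    return dx * (y.sum() - 0.5 * (y[0] + y[-1]))         # trapezoid rule

def xi_gauss(nu, w):
    """Excursion set correlation from 1 + xi = P2 / P1^2, as in (B2)."""
    return p2_gauss(nu, w) / p1_gauss(nu) ** 2 - 1.0

xi_uncorrelated = xi_gauss(2.0, 0.0)
xi_correlated = xi_gauss(2.0, 0.3)
```

For uncorrelated pixels P_2 = P_1² and the correlation vanishes, while a positive w gives a positive excursion set correlation that grows with the threshold.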

The analytic expressions derived in the main text apply
to the full-sky intrinsic CMB signal; the effect of noise is not
included. However, with the formalism introduced by Rossi
et al. (2009) we can also describe analytically the excursion
set clustering in the weak non-Gaussian limit, in the presence
of inhomogeneous noise. Maintaining the same notation, we
indicate the observed value in a pixel by D = T - ⟨T⟩ =
δT = s + n, which is the sum of the true signal s plus
noise n, both of which have mean zero. We consider a model
in which the signal is homogeneous and may have spatial
correlations whereas the noise, independent of the signal,
may be inhomogeneous and have spatial correlations. We
denote by p(D) the observed one-point distribution of D, p(s)
the distribution of s with rms σ_s, p(σ_n) the distribution
of the rms value of the noise in a pixel, and p(n|σ_n) the
distribution of the noise when the rms value of the noise is
σ_n. The one-point observed distribution is



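p(D) = \int ds\, p(s) \int dn\, p(n)\, \delta_D(s + n - D)
     = \int ds\, p(s) \int dn \int d\sigma_n\, p(n|\sigma_n)\, p(\sigma_n)\, \delta_D(s + n - D)
     = \int d\sigma_n\, p(\sigma_n) \int ds\, p(s)\, p(D - s|\sigma_n)
     = \int d\sigma_n\, p(\sigma_n)\, p(D|\sigma_n),    (B3)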



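where δ_D is the Dirac delta. The fraction of pixels above
some temperature threshold D_t is

f(D_t) = \int_{D_t}^{\infty} dD\, p(D) = \int d\sigma_n\, p(\sigma_n) \int_{D_t}^{\infty} dD\, p(D|\sigma_n)
       = \int d\sigma_n\, p(\sigma_n)\, f(D_t|\sigma_n).    (B4)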
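
A discrete version of this marginalization over the noise rms is sketched below. It assumes a Gaussian signal and Gaussian noise, so that the conditional fraction is an error function tail with total variance σ_s² + σ_n²; the list of noise levels with weights is a hypothetical stand-in for the distribution p(σ_n).

```python
import math

def fraction_above_given_noise(d_t, sigma_s, sigma_n):
    """f(D_t | sigma_n) for Gaussian signal plus Gaussian noise of rms sigma_n."""
    sigma_d = math.hypot(sigma_s, sigma_n)        # quadrature sum of the two rms
    return 0.5 * math.erfc(d_t / (math.sqrt(2.0) * sigma_d))

def fraction_above(d_t, sigma_s, noise_rms_values, weights):
    """Weighted sum over noise levels, a discrete analogue of (B4)."""
    total = sum(weights)
    return sum(w * fraction_above_given_noise(d_t, sigma_s, s_n)
               for s_n, w in zip(noise_rms_values, weights)) / total

# Half of the pixels with low noise, half with high noise (toy numbers).
mixed = fraction_above(2.0, 1.0, [0.2, 0.8], [1.0, 1.0])
low = fraction_above_given_noise(2.0, 1.0, 0.2)
high = fraction_above_given_noise(2.0, 1.0, 0.8)
```

The mixed result lies between the two single-noise values, and it reduces to the homogeneous case when all noise levels coincide.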



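Similarly, for two pixels separated by the angular distance
θ, or having correlation w = w(θ), the two-point observed
distribution is specified by:

p(D_1, D_2, w) = \int ds_1 \int ds_2\, p(s_1, s_2, w) \int dn_1 \int dn_2\, p(n_1, n_2)\, \delta_D(s_1 + n_1 - D_1)\, \delta_D(s_2 + n_2 - D_2)
  = \int ds_1 \int ds_2\, p(s_1, s_2, w) \int dn_1 \int dn_2 \int d\sigma_1 \int d\sigma_2\, p(n_1, n_2|\sigma_1, \sigma_2)\, p(\sigma_1, \sigma_2, w)\, \delta_D(s_1 + n_1 - D_1)\, \delta_D(s_2 + n_2 - D_2)
  = \int d\sigma_1 \int d\sigma_2\, p(\sigma_1, \sigma_2, w) \int ds_1 \int ds_2\, p(s_1, s_2, w)\, p(D_1 - s_1|\sigma_1)\, p(D_2 - s_2|\sigma_2)
  = \int d\sigma_1 \int d\sigma_2\, p(\sigma_1, \sigma_2, w)\, p(D_1, D_2, w|\sigma_1, \sigma_2)    (B5)

where

p(D_1, D_2, w|\sigma_1, \sigma_2) = \int ds_1 \int ds_2\, p(s_1, s_2, w)\, p(D_1 - s_1|\sigma_1)\, p(D_2 - s_2|\sigma_2).    (B6)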

Since μ = δT/σ = s/σ_s, where μ is the variable used in
the main text to indicate the threshold level, then p(s)ds =
p(μ)dμ and p(s_1, s_2, w) ds_1 ds_2 = p(μ_1, μ_2, w) dμ_1 dμ_2.
Therefore we can use the PDFs (15) and (21) to characterize (B3),
(B4) and (B5) in the weak non-Gaussian limit, when
inhomogeneous noise is present. Once (B3), (B4) and (B5) are
known, the pixel number density and the clustering
above/below threshold can be inferred from (B1) and (B2),
where now

P_1 = \int_{D_t}^{\infty} p(D)\, dD = f(D_t)    (B7)

and

P_2 = \int_{D_t}^{\infty} dD_1 \int_{D_t}^{\infty} dD_2\, p(D_1, D_2, w).    (B8)



The corresponding Gaussian limiting case has been pre- 
sented in detail in Rossi et al. (2009). In particular, if 
p{si,S2,w) is bivariate Gaussian with {s\) = {s'2) — ag, 
{S1S2) = Cs{0) as defined in equation (|26|l . and the noise 
p(n\an) is Gaussian with variable rms a^, then 



p{D^,D2,w\ai,a2) = ^=e-^°^-'=' 



(B9) 



27rcr^Viaia2 



exp 



Q2-D? + aiDl~ 2w D1D2 
2a^ {aia2 ~ m^) 



with Qi = -I- af)/al,, Q2 = (cr| + a2)/a^, and 



CN{e) = y ^Lt21 cf wr""'"" Pi (cos e). 



(BIO) 
(Bll) 



where $C$ is the covariance matrix of the temperature field, $\sigma_D^2$
the variance of $D$, $C_\ell^N$ the power spectrum of the noise map,
and $W_\ell^{\rm smooth}$ the additional smoothing due to finite pixel
size, optional Gaussian beam smoothing and mask influence.
Note in fact that in the presence of incomplete sky coverage
one needs to add an extra window function in (26) and in
(B11), according to the geometry of the survey, to account
for extra correlations introduced by the mask. If the noise is
spatially uncorrelated, then clearly $C_N(\theta) = 0$ and therefore
$w = C_S(\theta)/\sigma_D^2$. In the approximation where $\sigma_1 = \sigma_2$, i.e. when the rms
noise varies spatially on scales much larger than those of
interest, then $\alpha_1 = \alpha_2$. The "standard" approximation, rms
noise independent of position, has $\alpha_1 = \alpha_2 = 1$.
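The expanded bivariate Gaussian form of the conditional two-point distribution can be evaluated directly once $\alpha_1$, $\alpha_2$ and $w$ are known. The snippet below is a minimal sketch of that evaluation, assuming the Gaussian limit discussed above; the function name, the argument names and the numerical values are hypothetical, and the normalized correlation `w` is passed in directly rather than computed from the signal and noise correlation functions.

```python
import math

def conditional_pair_pdf(d1, d2, w, sigma_s, sigma_1, sigma_2, sigma_d):
    """Bivariate Gaussian p(D1, D2, w | sigma_1, sigma_2) in its expanded form.

    sigma_s: signal rms; sigma_1, sigma_2: noise rms at the two pixels;
    sigma_d: rms of the observed map D; w: normalized correlation (an input here).
    """
    alpha_1 = (sigma_s**2 + sigma_1**2) / sigma_d**2
    alpha_2 = (sigma_s**2 + sigma_2**2) / sigma_d**2
    det = alpha_1 * alpha_2 - w**2
    if det <= 0.0:
        raise ValueError("alpha_1 * alpha_2 - w**2 must be positive")
    quad = alpha_2 * d1**2 + alpha_1 * d2**2 - 2.0 * w * d1 * d2
    norm = 2.0 * math.pi * sigma_d**2 * math.sqrt(det)
    return math.exp(-quad / (2.0 * sigma_d**2 * det)) / norm

# Hypothetical values: homogeneous noise, so alpha_1 = alpha_2 = 1.
sigma_s, sigma_n = 70.0, 30.0
sigma_d = math.hypot(sigma_s, sigma_n)
print(conditional_pair_pdf(100.0, 120.0, 0.4, sigma_s, sigma_n, sigma_n, sigma_d))
```

For $w = 0$ the expression factorizes into two independent one-dimensional Gaussians with variances $\sigma_s^2 + \sigma_1^2$ and $\sigma_s^2 + \sigma_2^2$, which gives a quick sanity check of any implementation.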

Properly characterizing all these experimental complications,
when primordial non-Gaussianity is assumed, is the
next step in our analysis; it will be presented in a forthcoming
publication. In particular, in order to compute (B3),
(B4) and (B5) in the weak non-Gaussian limit, a detailed
knowledge of the noise distributions is required; namely,
$p(\sigma_n)$, $p(n|\sigma_n)$ and their corresponding two-dimensional
expressions must be specified. For example, those PDFs can
be directly measured from a real dataset and/or given by
the specifics of the experiment (e.g. WMAP, Planck, etc.),
as was the case in Rossi et al. (2009).

However, it is straightforward to predict what happens
in the presence of Gaussian white noise (independent of position,
with rms $\sigma_n$). In fact, in this case the effective rms of
the CMB map increases; it is given by $\sigma = \sigma_D = \sqrt{\sigma_s^2 + \sigma_n^2}$.
Hence, there will be a slight shift in the threshold level
$\nu = D/\sigma_D$, but all the equations derived in the main text
are still applicable - provided that one replaces $\sigma_s$ with the
effective rms of the map, $\sigma_D$. Handling inhomogeneous noise
is more complicated and will be presented separately, as the
overall effect on the pixel number density and clustering critically
depends on the detailed characteristics of the noise.
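The effect of homogeneous white noise on the threshold level amounts to a rescaling by the effective rms, with signal and noise rms added in quadrature. A minimal sketch of this rescaling follows; the function names and the numerical values are hypothetical and serve only to show the size of the shift.

```python
import math

def effective_rms(sigma_s, sigma_n):
    """Effective map rms: signal and white-noise rms added in quadrature."""
    return math.hypot(sigma_s, sigma_n)

def threshold_level(d, sigma_s, sigma_n=0.0):
    """Dimensionless threshold for a temperature cut d, using the effective rms."""
    return d / effective_rms(sigma_s, sigma_n)

# Hypothetical values in arbitrary temperature units.
sigma_s, sigma_n = 70.0, 20.0
d_cut = 140.0
print(threshold_level(d_cut, sigma_s))           # noise-free threshold level
print(threshold_level(d_cut, sigma_s, sigma_n))  # slightly lower once noise is included
```

A fixed temperature cut therefore corresponds to a slightly smaller dimensionless threshold once noise is included, which is the shift referred to in the text.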



APPENDIX C: ANALYTIC ERROR ESTIMATES



For a Gaussian random field, the uncertainties in the pixel 
number density and in the correlation function above/below 
threshold can be evaluated analytically from the optimal 
variance limit, which contains cosmic variance, instrumental 
noise, and finite bin size effects. Details can be found in Rossi 
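et al. (2009). In essence, the ultimate accuracy with which
the CMB power spectrum can be determined at each $\ell$ is
given by (Knox 1995):

\begin{equation}
\Delta C_\ell = \sqrt{\frac{2}{(2\ell+1)\, f_{\rm sky}}} \left[ C_\ell + \frac{\sigma_{\rm pix,tot}^2\, \Omega_{\rm pix}}{W_\ell^{\rm instr}} \right], \tag{C1}
\end{equation}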



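where $W_\ell^{\rm instr}$ is the instrumental window function and $f_{\rm sky}$
the fraction of the sky covered by the experiment. The uncertainty
in the angular correlation function for narrow bins
in $\theta$ is then:

\begin{align}
\Delta C(\theta) &= \left[ \sum_\ell \left( \frac{\partial C(\theta)}{\partial C_\ell} \right)^2 \Delta C_\ell^2 \right]^{1/2} \nonumber \\
&= \left[ \sum_\ell \left( \frac{2\ell+1}{4\pi} \right)^2 P_\ell^2(\cos\theta) \left( W_\ell^{\rm instr} W_\ell^{\rm smooth} \right)^2 \frac{2}{(2\ell+1)\, f_{\rm sky}} \left( \frac{2\pi\, \mathcal{C}_\ell}{\ell(\ell+1)} + \frac{\sigma_{\rm pix,tot}^2\, \Omega_{\rm pix}}{W_\ell^{\rm instr}} \right)^2 \right]^{1/2}, \tag{C2}
\end{align}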



where $\mathcal{C}_\ell = \ell(\ell+1)C_\ell/2\pi$ and $\Omega_{\rm pix} = \theta_{\rm pix}^2$ is the pixel
area. If the bin size is not infinitesimal, one needs to make
a small correction - which is negligible for the scales we
are interested in (see Rossi et al. 2009 for more details).
The uncertainties in the correlation function above/below
threshold are finally derived from:

\begin{equation}
\Delta \xi_\nu(\theta) = \frac{\partial \xi_\nu(\theta)}{\partial C(\theta)}\, \Delta C(\theta). \tag{C3}
\end{equation}



An example of how well this analytic relation works in the
Gaussian limit is shown in Figure C1 by the shaded area.
The result is compared with numerical estimates (errorbars),
when the full-sky intrinsic CMB signal is considered, in the
absence of pixel noise. As evident from the figure, the agreement
between theoretical expectations from equation (C3)
and numerical predictions is good.
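The first two steps of this error propagation, from the cosmic-variance limit on each multipole to the uncertainty on the angular correlation function, can be sketched numerically for the noise-free case considered in the figure. The code below is only an illustrative sketch: it assumes the usual Legendre expansion $C(\theta) = \sum_\ell (2\ell+1)/(4\pi)\, C_\ell W_\ell P_\ell(\cos\theta)$, takes an optional generic noise power to be added to $C_\ell$, and uses a toy power spectrum. All function names and numbers are hypothetical.

```python
import math

def knox_delta_cl(ell, c_ell, f_sky=1.0, noise_power=0.0):
    """Knox (1995) limit on C_ell; noise_power is a generic additive noise term."""
    return math.sqrt(2.0 / ((2 * ell + 1) * f_sky)) * (c_ell + noise_power)

def legendre_up_to(ell_max, x):
    """Legendre polynomials P_0..P_ell_max at x via Bonnet's recursion."""
    p = [1.0, x]
    for ell in range(1, ell_max):
        p.append(((2 * ell + 1) * x * p[ell] - ell * p[ell - 1]) / (ell + 1))
    return p[: ell_max + 1]

def delta_c_theta(theta, c_ells, f_sky=1.0, window=None):
    """Quadrature sum over ell of (dC(theta)/dC_ell * Delta C_ell)^2, noise-free.

    c_ells[ell] is the power at multipole ell (list index = ell);
    window[ell] is an optional combined window function (defaults to 1).
    """
    p = legendre_up_to(len(c_ells) - 1, math.cos(theta))
    total = 0.0
    for ell, c_ell in enumerate(c_ells):
        w_ell = 1.0 if window is None else window[ell]
        derivative = (2 * ell + 1) / (4.0 * math.pi) * w_ell * p[ell]
        total += (derivative * knox_delta_cl(ell, c_ell, f_sky)) ** 2
    return math.sqrt(total)

# Toy spectrum (hypothetical): no monopole or dipole, C_ell ~ 1/(ell*(ell+1)).
c_ells = [0.0, 0.0] + [1.0 / (ell * (ell + 1)) for ell in range(2, 200)]
theta = math.radians(75.0 / 60.0)  # 75 arcmin
print(delta_c_theta(theta, c_ells))
```

The final step to the correlation function above/below threshold additionally requires the derivative of that correlation function with respect to $C(\theta)$, which depends on the threshold and is not reproduced here.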

In the presence of primordial local non-Gaussianity, the situation
is more complicated. In principle, one can still follow
the previous steps and derive similar analytic expressions.
However, when $f_{\rm NL} \neq 0$ the power spectrum $C_\ell$ is different
from the Gaussian case, and the full-sky two-point angular
correlation function $C(\theta)$ cannot be expressed as simply as in
(26): one needs to account for extra correlations introduced
by the primordial non-Gaussianity. If noise is also included,
the situation is even more complicated - as presented in
Appendix B. If we instead neglect the bispectrum contributions
that modify the power spectrum and the angular
correlation function (as happens for large $f_{\rm NL}$), then (C3) will have an
additional term proportional to $f_{\rm NL}$; the extra non-Gaussian
part can be inferred from the two-dimensional PDF (21), i.e.
a term proportional to

Figure C1. Analytic estimate of the errors for the pixel correlation
function above/below threshold (shaded area) derived from
equation (C3), in the Gaussian limit. Numerical errorbars are also
shown; they are in good agreement with the theoretical predictions.
The threshold level is $|\nu| = 2.00$.

However, we feel that estimating the errorbars by using
a large set of simulations is more accurate, for any arbitrary
value of non-Gaussianity and in particular when $f_{\rm NL}$
is small. This is why we do not attempt to derive analytic
uncertainties in the weak non-Gaussian limit; rather, our
approach is to estimate errors using a large set of numerical
simulations, with some guidance provided by the analytic
expressions (C1)-(C3) valid in the Gaussian regime.
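The simulation-based error estimate described above reduces, in its simplest form, to taking the scatter of a statistic across realizations. The following is a minimal sketch of that step, assuming each simulated map has already been reduced to a list of binned measurements (for instance the correlation function above threshold in a few angular bins). The function name and the toy numbers are hypothetical and are not the authors' pipeline.

```python
import math

def ensemble_mean_and_error(realizations):
    """Per-bin mean and sample standard deviation across simulated realizations.

    realizations: list of equal-length lists, one list of binned values per map.
    """
    n_real = len(realizations)
    if n_real < 2:
        raise ValueError("need at least two realizations to estimate a scatter")
    n_bins = len(realizations[0])
    means, errors = [], []
    for b in range(n_bins):
        values = [r[b] for r in realizations]
        mean = sum(values) / n_real
        variance = sum((v - mean) ** 2 for v in values) / (n_real - 1)
        means.append(mean)
        errors.append(math.sqrt(variance))
    return means, errors

# Toy input: three realizations, two angular bins (hypothetical values).
sims = [[0.10, 0.02], [0.12, 0.03], [0.08, 0.01]]
means, errors = ensemble_mean_and_error(sims)
print(means, errors)
```

The sample standard deviation gives the expected scatter of a single map; comparing it with the Gaussian-limit analytic estimate provides the kind of guidance mentioned in the text.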



APPENDIX D: SPURIOUS 
NON-GAUSSIANITIES 

Even for standard "optimal" estimators like the bispectrum
or trispectrum, the problem of non-Gaussianities arising
from non-primordial sources is very challenging. There are
many different contaminants which can be confused with
primordial non-Gaussian signals. Those include (1) instrumental
effects, such as beam asymmetries, inhomogeneous noise,
masks or incomplete sky coverage, (2) astrophysical contaminants,
such as point sources, foregrounds, presence of voids
or anomalous cold spots, (3) secondary anisotropies, such
as the Integrated Sachs-Wolfe effect (ISW) and lensing, and
so forth. In certain cases the contamination is negligible, in
other cases it may be severe but one can account for it. There
are also situations in which the spurious non-Gaussianity is
hard to account for, or its contribution remains unclear.

Regarding instrumental effects, the inhomogeneity of
the noise is the most critical problem. However, in Appendix
B we showed how to extend our formalism in the presence of
inhomogeneous noise. Given a good knowledge of the experimental
beams, window functions and noise absolute calibration,
it is possible to separate its contribution from a
primordial non-Gaussianity. Another possible contaminant
is introduced by partial sky coverage: for instance, edge effects
due to pixels which lie very close to the mask could
potentially induce undesirable non-Gaussianities, but their
effect can be carefully modelled.

As for astrophysical contaminants, the presence
of low-density regions in the southern Galactic cap (cold
spots), and the contribution by unknown point sources or
anomalous foreground emissions, are all possible sources
of confusion. Their accurate measurement is therefore crucial.
In particular, uncertainties in the foreground template
model used for the foreground subtractions may introduce
anomalies at the percentage level, since Galactic foregrounds
are non-Gaussian and anisotropic. While an optimal estimator
has a clear framework to assess the amount of contribution
from secondary sources, in our case we may account for
point source contaminations in two ways: theoretically, and
by using simulations. For example, from a theoretical point
of view, the following calculation shows how to quantify
the contamination in the correlation function. Denote with
the subscript P a point source, and with T the pixel temperature;
use $N$ for the respective number of pairs. Assume
no correlations between point sources (i.e. $\xi_{PP} = 0$) and neglect
possible cross-correlations (i.e. $\xi_{TP} = 0$). The overall
observed unweighted correlation function is then:



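\begin{align}
1 + \xi_{\rm obs} &= \frac{N_{TT}(1+\xi_{TT}) + N_{PP}(1+\xi_{PP}) + N_{TP}(1+\xi_{TP})}{N_{TT} + N_{PP} + N_{TP}} \nonumber \\
&= \frac{N_{TT}(1+\xi_{TT}) + N_{PP} + N_{TP}}{N_{TT} + N_{PP} + N_{TP}} \nonumber \\
&= \frac{N_{TT}(1+\xi_{TT})}{N_{TT} + N_{PP} + N_{TP}} + \frac{N_{PP} + N_{TP}}{N_{TT} + N_{PP} + N_{TP}} \nonumber \\
&\simeq \frac{N_{TT}(1+\xi_{TT})}{N_{TT} + N_{PP} + N_{TP}} \tag{D1}
\end{align}

since $N_{TT} \gg N_{PP} + N_{TP}$. Therefore one can write:

\begin{equation}
1 + \xi_{\rm obs} \simeq \gamma\, (1 + \xi_{TT}) \tag{D2}
\end{equation}

where

\begin{equation}
\gamma = \frac{N_{TT}}{N_{TT} + N_{PP} + N_{TP}} \tag{D3}
\end{equation}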

and $N_{TT} = n_T(n_T - 1)/2$, $N_{PP} = n_P(n_P - 1)/2$, $N_{TP} = n_T n_P$,
with $n_T$ the number of effective temperature pixels
and $n_P$ the number of spurious undetected point sources
(equivalent to bad pixels). With a simulation approach, we
can also quantify very accurately the contamination induced
by point sources. This is achieved by adding point sources
to the mock maps, and by repeating the same analysis as
for the uncontaminated case. A comparison between the two
situations allows one to quantify the degree of contamination.
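The theoretical estimate of the point-source dilution can be turned into a few lines of arithmetic. The sketch below computes the pair counts and the factor $\gamma$ from the numbers of temperature pixels and undetected point sources, and then applies the approximate relation between the observed and the intrinsic correlation. The function names and the example numbers are hypothetical; the relation itself is only valid when temperature-temperature pairs dominate.

```python
def pair_counts(n_t, n_p):
    """Numbers of TT, PP and TP pairs for n_t temperature pixels and n_p point sources."""
    n_tt = n_t * (n_t - 1) / 2.0
    n_pp = n_p * (n_p - 1) / 2.0
    n_tp = float(n_t * n_p)
    return n_tt, n_pp, n_tp

def dilution_factor(n_t, n_p):
    """gamma = N_TT / (N_TT + N_PP + N_TP)."""
    n_tt, n_pp, n_tp = pair_counts(n_t, n_p)
    return n_tt / (n_tt + n_pp + n_tp)

def observed_correlation(xi_tt, n_t, n_p):
    """Approximate observed correlation: 1 + xi_obs ~ gamma * (1 + xi_TT)."""
    return dilution_factor(n_t, n_p) * (1.0 + xi_tt) - 1.0

# Hypothetical example: one million pixels above threshold, 100 undetected sources.
print(dilution_factor(10**6, 100))
print(observed_correlation(0.05, 10**6, 100))
```

With no point sources the factor is exactly one and the observed correlation equals the intrinsic one; any contamination lowers $\gamma$ and therefore dilutes the measured clustering.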

Spurious non-Gaussianities could also arise from secondary
anisotropies, such as gravitational lensing, cosmic
reionization, Sunyaev-Zel'dovich, Sachs-Wolfe or Ostriker-Vishniac
effects (see Komatsu 2010 for a recent review).
Phase transitions in the early Universe may also introduce a
new source of non-Gaussianity, difficult to disentangle from
a primordial non-Gaussian signal. In particular, the most
serious contamination of the local $f_{\rm NL}$ model is represented
by the coupling between the ISW and the weak gravitational
lensing: in fact, the coupling between small and large scales
creates a local form bispectrum of non-primordial origin.
Recently, Hanson et al. (2009) have shown that the lensing-ISW
coupling can cause a bias in the $f_{\rm NL}$ parameter on the
order of $\Delta f_{\rm NL} \sim 10$. However, in general the primordial
non-Gaussian signal can be separated from non-Gaussian
secondary anisotropies on scales relevant for WMAP and
Planck.

All these effects must of course be accounted for before one
can claim a pure detection of primordial non-Gaussianity.
We are planning to address all these issues in detail, when
we apply our techniques to a real dataset.



